Find identical entries in CSV/XLS file

Collapse
X
 
  • Time
  • Show
Clear All
new posts
  • JR_ESB
    New User
    • Jul 2016
    • 1

    Find identical entries in CSV/XLS file

    Hi everyone,

    I have two files:
    - 1 CSV file that contains a long list of items (approximately 4 MB)
    - 1 XLS file that contains another list of items (much smaller, about 300KB)

    I would like to compare these files with each other and see if items from a column in my xls file are also found in the csv file.
    The tables do not have similar layouts.
    E.g., in column C of my semicolon-separated CSV file is a list of license plates followed by a $ sign (for whatever reason), my XLS file contains just the license plates in column A.
    I would like to have some kind of automated task that checks the CSV file for plates found in my XLS file.

    I tried to solve this with Beyond Compare - I am a novice user but have read the documentation and watched tutorial videos.

    The software always compares the files line by line which is not what I need - I want it to alert me if there is an item in both lists.

    I am running Beyond Compare Pro 4.0.7 (Build 19761) on a commercial license at work.
    Any ideas would be greatly appreciated.

    If you need any other info from me, please let me know.

    Thanks,
    Julian
  • Aaron
    Team Scooter
    • Oct 2007
    • 16000

    #2
    Hello,

    Ideally, the Table Compare can have the columns configured so the same data is aligned. Even if data appears in different columns (Col1 vs. Col3), this can be re-aligned in the Table Compare's Session menu -> Session Settings, Columns tab. Here, each cell represents the column header, and you can shift them up or down to line up the Left File Col1 with Right File Col3.

    This does assume the data is formatted identically in Col1 and Col3. Otherwise, it will still be marked as a difference.

    To find and align rows, you'll want to define a Key column. The default Key is Column 1, which is sorted and aligned by, but you can right click any Column header to set as Key, Standard, or Unimportant. A good Key would be a unique identifier, like the License Plate, an ID, or a combination of two columns together both marked as key (FirstName + LastName).

    If you can remove your $, then I think aligning Col1 to Col3 and marking it as the Key would work like you'd need. The data in the cell needs to be identical in order to work as a Key, however.


    The alternate method is to use the Text Compare. This wouldn't sort the lines, but would align similar text and could ignore the "$" character. It needs an additional download File Format, here:
    http://www.scootersoftware.com/downl..._moreformatsv4
    Aaron P Scooter Software

    Comment

    Working...