Delete one text list from another.


  • Indent
    replied
    Hi Aaron

    I am sorry for taking your time up with this.

    Can I just make it clear here that I have no trouble loading larger files in for the normal functions of BC3. It’s just with the topic discussed here that I am having trouble.

    Laptop, about 3 years old.
    Intel Core2
    T5500 @ 1.66GHz
    2GB RAM
    XP Pro SP3

    The files do load OK, but BC3 then seems to lock up. I have left it for up to 15 minutes with a 3.4MB (3 point 4, not 34) List A file together with a List B file of anything up to 2MB, and I have to kill it to close it.

    As I say duplicates are ok as I have another program to sort that, but thank you anyway.


  • Aaron
    replied
    Hello,

    Our maximum file size tests are available here:
    http://www.scootersoftware.com/suppo...z=kb_maxfilev3

    Currently, we have tested files upwards of 500 MB in the Text Compare. The practical limit still depends on the hardware you are using. How old is your computer, and what kind of processor and RAM does it have?

    Also, the "above method" in the KB article is specifically for RESX files. You would need to provide your own duplicate-removing command-line program, which can then be plugged in and used automatically.


  • Indent
    replied
    Hi Aaron

    Well, thank you very much, you worked it out !!!

    Yes it made all the difference changing the “Never Align Mismatches” although in my version it is called “Never Align Differences”.

    I did change the last example to contain duplicates as I thought BC would do that automatically but it is no problem as there is a workaround and I can also remove duplicates with another program beforehand.

    The workaround, if you are interested, is to keep saving and reloading the lists; eventually all the duplicates get picked up by the method you described.

    Unfortunately, even though the method you kindly provided does work, it is so incredibly slow that it is almost unusable even with very small text files of only 5MB or so. I have some text files that are 200MB !

    You ask what I am doing “In the real world” ha ha ! Well, I admit this probably isn’t a common thing, so don’t feel bad about it. I use your program as everyone else does, I suppose, to compare 2 documents that have been written or changed in 2 different places. I sometimes have a copy on my laptop and one on my computer, so things can get messed up. I use your program to compare the 2 versions and pretty much just use the merge function.

    This new idea, or problem, is that I am making text lists of words for specific subjects. These are growing in size and I realise that many of the text files contain the same words. I thought I could reduce the size of all these files by removing the contents of the English dictionary from each one, which should leave me with only the words specific to a given subject. So once you remove the dictionary you are left with slang terms and terms unique to that subject.

    I must admit I didn’t think comparing text files on a computer would be a problem but I simply cannot find any software anywhere that can do it. I know yours can but as I say it cannot handle text files of more than a few hundred KB, well not on my computer anyway.

    Basically, what I am looking for is the “Find and Replace” found in all text editors, but able to take a whole text file as the search terms rather than a single word, and then simply replace every match with a “blank”.
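For readers who would rather script this "find and replace with a file" outside BC3, here is a minimal Python sketch. It is not a Beyond Compare feature; the function name `remove_listed_lines` and its parameters are hypothetical, and it assumes one word or entry per line with exact, case-sensitive matching. List A (for example the dictionary) is held in memory as a set, while List B is streamed line by line.

```python
def remove_listed_lines(list_a_path, list_b_path, output_path, encoding="utf-8"):
    """Write every line of List B that does not appear in List A.

    Returns the number of lines kept. Matching is on whole lines,
    exact and case-sensitive (an assumption of this sketch).
    """
    # Load List A into a set so each lookup is a constant-time hash check.
    with open(list_a_path, encoding=encoding) as fa:
        unwanted = {line.rstrip("\r\n") for line in fa}

    kept = 0
    # Stream List B so it never has to fit in memory all at once.
    with open(list_b_path, encoding=encoding) as fb, \
            open(output_path, "w", encoding=encoding) as out:
        for line in fb:
            text = line.rstrip("\r\n")
            if text not in unwanted:
                out.write(text + "\n")
                kept += 1
    return kept
```

Because each line of List B costs a single set lookup, the position of a line in either list does not matter and no alignment step is involved.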

    Out of interest, do you know what the maximum text file size BC3 can handle is, please ?

    Thank you for your help.


  • Aaron
    replied
    Also, what kind of files are you comparing? We are always interested in hearing how our customers are using our application in real world scenarios.


  • Aaron
    replied
    Hello,

    If you enable Never Align Mismatches and use the Sorted file format, I believe this should align as expected with the initial example.

    With the new example, however, you have introduced the concept of duplicate lines that you wish to remove. Unfortunately, we do not have a method of automatically removing these at this time. As a workaround, you can use our Replace dialog to manually replace all instances of specific text (line 8) with a blank line to remove them.

    You can also create your own custom external conversion. If the conversion removed all duplicate entries from a file and sorted the remaining unique lines, you could then continue to use the above method:
    http://www.scootersoftware.com/suppo...rnalconversion
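As a rough illustration of what such a duplicate-removing, sorting step might look like, here is a minimal Python sketch. The function names `unique_sorted_lines` and `convert` are hypothetical, and how Beyond Compare hands the source and target file names to an external conversion is not covered in this thread, so the two path parameters are an assumption to be adapted to the KB article.

```python
def unique_sorted_lines(lines):
    """Return the distinct lines in alphabetical (code point) order."""
    # Strip line endings first so "word\n" and "word\r\n" count as the same entry.
    return sorted({line.rstrip("\r\n") for line in lines})


def convert(source_path, target_path, encoding="utf-8"):
    """Read source_path and write its unique, sorted lines to target_path."""
    with open(source_path, encoding=encoding) as src:
        result = unique_sorted_lines(src)
    with open(target_path, "w", encoding=encoding) as dst:
        dst.writelines(line + "\n" for line in result)
```

To use it as a command-line program, a small wrapper could call `convert(sys.argv[1], sys.argv[2])`. Note that plain string sorting places "line 10" before "line 2".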


  • Indent
    replied
    Hi Aaron

    Thank you for your reply and help.

    I tried what you said and I can see what you mean, however it didn’t really work. It did remove some lines, but where the same line appears more than once in List B it doesn’t seem to work.

    I did mess about with this for some hours and discovered that it seemed to work better if I went to Rules / Alignment and set it to Alternate Method.

    I have made a sample list for you below with very few lines that seem to throw the program. They are deliberately messed about to show you the problem.

    I think this is obviously a function this program isn’t meant to perform, and I guess it is an unusual request. So rather than me wasting your time trying to make it work, could this perhaps be a feature request for a later version ?

    Could you please make an option to allow 2 lists of different sizes to be compared and then a button to remove all the “same” matching lines / words from list B ? Also make it so they don’t have to be aligned. As mentioned before almost like an automatic “find and replace with blank”.

    Thanks

    Here’s a sample for you to play with.

    List A

    line 1
    line 2
    line 3
    line 4
    line 5
    line 6
    line 7
    line 8
    line 9
    line 10



    List B

    line 1 This line should be left after filtering. 1/4
    line 6
    line 4B This line should be left after filtering. 4/4
    line 1
    line 2
    line 3
    line 1
    line 7
    line 8
    line 6
    line 7
    line 8
    line 2
    line 5
    line 6
    line 2B This line should be left after filtering. 2/2
    line 1
    line 7
    line 8
    line 8
    line 2
    line 6
    line 7
    line 3B This line should be left after filtering. 3/4
    line 4
    line 5
    line 6
    line 1
    line 3


  • Aaron
    replied
    Given your example, you should get the result you expect by using the Show Same filter and deleting lines 1-4 on the right side.

    The issue I believe you are running into arises when lines 1-4 are not in the same order on both sides, for example where List A is:

    line 2
    line 3
    line 4
    line 1
    line 5
    line 6
    line 7
    line 8


    To help with this scenario, your files can be sorted. If your data is only line by line, and it is OK to sort every line independently and alphabetically, we have a Sorted file format that can do that. Go to the Session menu -> Session Settings, Format tab, and switch from Detected to Sorted for both sides.

    How does that work for you? If you are still having any trouble, would you be able to post or email us a pair of example files? Our email is [email protected] and if you email us please include a link back to this forum post.


  • Indent
    started a topic Delete one text list from another.

    Hi

    You will have to forgive me as I am a bit of an idiot as I can’t seem to do the simplest things sometimes !

    I am trying to use your software to compare two text lists and subtract one from the other.

    I managed to work out that if I use the “Show Same” view and then delete the displayed lines, that does work... a bit.

    However what I am trying to do is simply remove the same words or lines that are on list A from list B.

    Your program seems to only check for lines that are in a similar position and not search throughout the entire list. I guess what I am trying to do is a bit like an automated “Find and Replace” where the entire text of list A is compared to the entire text of list B.

    Sample.

    List A

    line 1
    line 2
    line 3
    line 4
    line 5
    line 6
    line 7
    line 8

    List B

    Other Text 1
    Other Text 2
    Other Text 3
    Other Text 4
    line 1
    Other Text 5
    line 2
    Other Text 6
    line 3
    Other Text 7
    Other Text 8
    line 4
    Other Text 9

    What I am trying to achieve would be an output of …

    List C

    Other Text 1
    Other Text 2
    Other Text 3
    Other Text 4
    Other Text 5
    Other Text 6
    Other Text 7
    Other Text 8
    Other Text 9

    So List A is subtracted from List B to give me List C.

    Can anyone help me please ?
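The subtraction described in this post can be sketched in a few lines of Python, independent of Beyond Compare. This is only an illustration under the assumption that lines must match exactly; `subtract_lines` is a hypothetical helper name, and the data is the sample from the post above.

```python
def subtract_lines(list_a, list_b):
    """Return the lines of list_b that do not occur anywhere in list_a.

    The order of list_b is preserved and matching is exact.
    """
    unwanted = set(list_a)  # position in List A is irrelevant
    return [line for line in list_b if line not in unwanted]


# Sample data from the post: List A holds "line 1" .. "line 8".
list_a = [f"line {n}" for n in range(1, 9)]
list_b = [
    "Other Text 1", "Other Text 2", "Other Text 3", "Other Text 4",
    "line 1", "Other Text 5", "line 2", "Other Text 6", "line 3",
    "Other Text 7", "Other Text 8", "line 4", "Other Text 9",
]

# List C is List B with every line of List A removed.
list_c = subtract_lines(list_a, list_b)
for line in list_c:
    print(line)
```

With the sample lists, this prints "Other Text 1" through "Other Text 9", matching List C.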