Large CSV with many columns comparison very slow.


  • Aaron
    replied
    I should add: since you are switching computers, it would be good to test both versions on the same hardware to see how each performs. You can do this by running the appropriate setup.exe and selecting the Portable Install option. You can create as many Portable Installs on your Desktop as you like; each is a single-directory install, and they do not interact with each other.

    Otherwise, if working with just one BC3 and one BC4 install, you can install them normally, as they install to Beyond Compare 3\ and Beyond Compare 4\ directories respectively.



  • fphillips
    replied
    Ok. I'll check it out at home and generate some dummy data to see what happens. Shouldn't be too big of a deal to do.

    Thanks for the info.

    Frank



  • Aaron
    replied
    Hello,

    You would get access to more memory if you were able to update to BC 4.1(.5); 4.0.7 was our last 32-bit-only release. I'm not certain whether this would improve performance in your case without more specific files or information, but if you have an environment where you can try it, that would be a good first test.



  • fphillips
    replied
    Would love to upgrade, but we're a controlled corporation, so only approved applications can be used, and currently that is BC3. I don't even have access on my laptop to install anything outside the approved list. I'm working on that, but it won't happen soon enough to help.

    The key field defaults to the first column, which works for these files since the key is the first column. That isn't always the case, though, and a lot of the time there are multiple keys. We use the unimportant and key field options heavily: much of our work is maintenance, bug fixes, and enhancements, so we are constantly marking new fields, and fields whose calculations have changed, as unimportant. Let's just say we use the little referee button a lot.
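
    Just to illustrate the multiple-key case (the column positions below are made up, and this is only the underlying idea, not how BC is configured):

        # Hypothetical sketch: when a record is identified by more than
        # one field, the alignment key is a tuple of columns rather than
        # just the first one. Indexes are invented for illustration.
        KEY_COLS = (0, 2)  # e.g. account number + transaction date

        def row_key(row):
            return tuple(row[i] for i in KEY_COLS)

        print(row_key(["1001", "ACME", "2016-03-01", "42.50"]))
        # prints ('1001', '2016-03-01')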

    Mostly I was hoping there was a setting I couldn't find that would allow more memory usage. Using only 80 MB of memory to compare two 200+ MB files suggests a lot of disk activity. Since we have about 59 GB of free memory, I was hoping to raise the program's usage toward the roughly 3 GB maximum for a 32-bit process.
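
    For what it's worth, that 3 GB figure is just the 32-bit address-space ceiling. A generic check (plain Python, nothing BC-specific) of what a given process can address:

        import struct

        # A 32-bit process can address at most 2**32 bytes (4 GiB) in
        # total; on Windows, user space is normally capped at 2 GiB, or
        # roughly 3-4 GiB for large-address-aware executables.
        bits = struct.calcsize("P") * 8  # pointer size in bits
        print(f"{bits}-bit process, {2**bits / 2**30:.0f} GiB of address space")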

    Anyway, it looks like we're going to have to live with it for now and hope we don't encounter too many lines of business that put out these massive files.

    Thanks for the quick reply,

    Frank



  • Aaron
    replied
    Hello,

    Thanks for the feedback. The first thing I would recommend is trying the trial of BC4, which you can install without altering or removing BC3 (on Windows). BC 4.1.5 has an improved Table Compare (renamed Data Compare) and 64-bit support, which may help with large files.

    For BC3's Data Compare, several factors can affect performance. The number of columns could be an issue, but so could the data in those columns. Would your additional columns contain the same amount of text, making the files much larger, or would they be much smaller? Are you using the default key (column 1)?

    The Text Compare could also be configured to ignore defined grammars. Could the new data be defined by a regular expression or by set character positions? We have a guide on defining unimportance here: http://www.scootersoftware.com/suppo..._unimportantv3
    Once text is marked Unimportant, you can enable Ignore Unimportant Differences and the rest of each line will be compared.
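
    For example, if the new data were an appended ISO date (purely hypothetical; substitute whatever your added columns actually contain), a grammar element based on a regular expression like the one below would match it. A quick Python check of what such a pattern covers:

        import re

        # Hypothetical pattern: an appended ISO date column. A Text
        # Compare grammar element defined with this expression would
        # mark the dates as unimportant text.
        pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

        line_old = "1001,ACME,42.50"
        line_new = "1001,ACME,42.50,2016-03-01"
        # With the matches removed, the lines differ only by a trailing
        # comma, i.e. the remaining (important) text is the same.
        print(pattern.sub("", line_old))  # 1001,ACME,42.50
        print(pattern.sub("", line_new))  # 1001,ACME,42.50,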



  • fphillips
    started a topic Large CSV with many columns comparison very slow.


    Hi all.

    We're currently using BC v3 to compare our current program's output against the previous version's output to make sure that only expected changes are present.

    However, we have a few CSV files that are very large (300+ MB) and have 60+ columns of data. Since these are CSVs and we need to ignore expected differences, we have to use the Data Compare rather than the Text Compare.

    The problem I'm having is how long it takes to process this data. I'm currently near the end of the first hour of a comparison (in this case the files are 200+ MB but only 3 or 4 columns wide) and am wondering if there is a way to speed this up. My thought is that if it's taking this long for 4 columns, I'm toast when I get to the files with 60+ columns.

    The Text Compare works much faster; however, we've had to add a few columns in the current program version, and since the Text Compare matches line by line, it isn't really a benefit.
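
    Conceptually, what we need is a key-based, column-aware comparison along these lines (file names, key column, and the unimportant column are made up for illustration; this is just the idea, not how BC works internally):

        import csv

        KEY_COL = 0        # rows are matched on the first column
        UNIMPORTANT = {3}  # column indexes with expected differences

        def load(path):
            # Index every row by its key so rows align regardless of order.
            with open(path, newline="") as f:
                return {row[KEY_COL]: row for row in csv.reader(f)}

        def diff(old_path, new_path):
            old, new = load(old_path), load(new_path)
            for key in sorted(old.keys() & new.keys()):
                for i, (x, y) in enumerate(zip(old[key], new[key])):
                    if i not in UNIMPORTANT and x != y:
                        print(f"key {key}: column {i}: {x!r} -> {y!r}")

        diff("previous.csv", "current.csv")

    Note that even this naive version holds both files in memory at once, which is exactly where a 32-bit process's limits start to bite.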

    Any thoughts or suggestions would be appreciated.


    Thanks,
    Frank