Calculating percentage from the report

**Chris** · 08-Nov-2017, 10:18 AM

Beyond Compare doesn't provide a report that lists the percentage of matching lines.

You can obtain the total number of lines in each file by adding up the summary counts.

Left file lines = same lines + unimportant left orphan lines + unimportant difference lines + important left orphan lines + important difference lines

Right file lines = same lines + unimportant right orphan lines + unimportant difference lines + important right orphan lines + important difference lines

Unimportant difference lines are those that exist on both sides but with changes in whitespace, character case, or comments, so you can probably consider them a match.

Assuming the original file is on the left and the possibly copied file is on the right, the percentage of original file lines that exist in the possible copy might look like:
(same lines + unimportant difference lines) / total left side lines × 100

The percentage of the potentially copied file's lines that come from the original file might look like:
(same lines + unimportant difference lines) / total right side lines × 100
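
To make the arithmetic concrete, here is a minimal Python sketch of the formulas above. The `SummaryCounts` class, its field names, and the sample numbers are all made up for illustration; you would type in the counts from your own report summary. It assumes the original file is on the left and the suspected copy is on the right.

```python
from dataclasses import dataclass


@dataclass
class SummaryCounts:
    """Line counts copied by hand from a comparison report summary.

    Field names are hypothetical; they only mirror the terms used above.
    """
    same: int
    unimportant_left_orphan: int
    unimportant_right_orphan: int
    unimportant_difference: int
    important_left_orphan: int
    important_right_orphan: int
    important_difference: int


def left_total(c: SummaryCounts) -> int:
    # Left file lines = same + left orphans + difference lines
    return (c.same + c.unimportant_left_orphan + c.unimportant_difference
            + c.important_left_orphan + c.important_difference)


def right_total(c: SummaryCounts) -> int:
    # Right file lines = same + right orphans + difference lines
    return (c.same + c.unimportant_right_orphan + c.unimportant_difference
            + c.important_right_orphan + c.important_difference)


def match_percentages(c: SummaryCounts) -> tuple[float, float]:
    """Return (percent of left lines matched, percent of right lines matched)."""
    # Unimportant differences are treated as matches, as suggested above.
    matched = c.same + c.unimportant_difference
    lt, rt = left_total(c), right_total(c)
    left_pct = 100.0 * matched / lt if lt else 0.0
    right_pct = 100.0 * matched / rt if rt else 0.0
    return left_pct, right_pct


# Made-up sample counts, not taken from any real report.
counts = SummaryCounts(
    same=80,
    unimportant_left_orphan=2,
    unimportant_right_orphan=1,
    unimportant_difference=10,
    important_left_orphan=5,
    important_right_orphan=20,
    important_difference=3,
)
left_pct, right_pct = match_percentages(counts)
print(f"left total:  {left_total(counts)} lines, {left_pct:.1f}% matched")
print(f"right total: {right_total(counts)} lines, {right_pct:.1f}% matched")
```

With these sample numbers the two percentages differ (90.0% of the left file versus about 78.9% of the right file), which is why it matters which file's total goes in the denominator.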

Note that Beyond Compare doesn't detect moved lines; they're shown as adds and deletes. This means that if the copied code had a function from the beginning of the original file moved to the end, those lines won't be reported as same lines.
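
The effect of a moved block can be illustrated with Python's standard `difflib` module. This is not Beyond Compare's algorithm, only another order-preserving line comparison that likewise doesn't track moves; the two tiny "files" below are invented for the example.

```python
from collections import Counter
from difflib import SequenceMatcher

# Invented example: the copy has function a() moved from the top to the bottom.
original = [
    "def a():", "    return 1",
    "def b():", "    return 2",
    "def c():", "    return 3",
]
copied = [
    "def b():", "    return 2",
    "def c():", "    return 3",
    "def a():", "    return 1",
]

# Order-preserving alignment: lines only count as matching if they can be
# lined up without reordering, so the moved function is left unmatched.
matcher = SequenceMatcher(None, original, copied, autojunk=False)
aligned = sum(block.size for block in matcher.get_matching_blocks())

# Order-insensitive count: how many original lines appear anywhere in the copy.
unordered = sum((Counter(original) & Counter(copied)).values())

print(f"aligned matches:   {aligned} of {len(original)} lines")
print(f"unordered matches: {unordered} of {len(original)} lines")
```

Here every line of the original is present in the copy, yet an order-preserving comparison counts only 4 of 6 lines as matching, so a percentage built from same-line counts can understate how much was copied.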

There are dedicated tools for detecting source code plagiarism, and they might work better than a general-purpose comparison tool like Beyond Compare. A quick Google search for "source code plagiarism detection" turned up MOSS (Measure of Software Similarity) as an example.

**Roger Donnay** · 08-Nov-2017, 05:00 PM

This is good information. Thank you.
The expert on the other side used Beyond Compare to get a percentage, but he didn't clarify how he arrived at that percentage, and it just didn't look right to me.
