
BC2.5.1 File Viewer alignment


chrisjj
16-Nov-2007, 04:39 AM
I'm finding even 'thorough' Standard Method giving poor results, e.g. missing the match between the blue-ringed strings in:

http://img164.imageshack.us/img164/5683/image16sk0.th.gif (http://img164.imageshack.us/my.php?image=image16sk0.gif)

Is this to be expected?

(Alternate Method succeeds here, but at cost.)

chrisjj
16-Nov-2007, 06:24 AM
Update: I find reducing 'Skew Tolerance' (sic) to 1 solves this case...

... but greatly worsens many others. E.g. loses the (six) matches shown by ST=4000 in a 200 lines v. 2000 lines compare (files available*).

BTW, I find this UI rather obstructive to trialling of these settings. Edit Current Rules... needs:

1 A key command to open it
2 Retention of the user's tab selection
3 An Apply button
4 No sneaky things like moving the Standard Method slider when removed from the user's sight! (Upon OK click.)

Chris

* Odd that this forum won't let me attach them to this message.

Aaron
16-Nov-2007, 12:34 PM
It is expected behavior. The higher the cost of the alignment, generally the better it is.

You can try tweaking the settings a bit to get better results. Perhaps turn off Align Similar Lines in that example. And/or change the skew tolerance.

Aaron
16-Nov-2007, 01:54 PM
A few more bits of information:

Standard method often does better with source code than it does with data dumps (it looks like your example is a tab-delimited set of information, with similar text throughout). Alternate method works better here.
Also, if you could include the screenshots showing your line weight and Importance tab, that would help as well.

For this example, you may want to consider using the Data Viewer Plugin, in delimited mode, with Tabs set as the delimiter. Does that present your data better? (include screenshot)

chrisjj
17-Nov-2007, 01:20 PM
> It is expected behavior.

I'm surprised.

>~~> The reason it is not aligning on this line is
>~~> because line 5 is aligning to line 1 on the right.

I see no such match. Did you mean "5 to 2"?

>~~> Decrease your tolerance down to 10,50, or 100.

That makes no difference - as expected, since the effective value (<= the number of file lines) is already 7.

> Perhaps turn off Align Similar Lines

That loses all aligns, since all the lines are only similar.

Craig
17-Nov-2007, 02:54 PM
Chris,

Send a copy of your files to support. We've tweaked the "Align similar lines" algorithm for BC3, so we'll be able to tell you whether the alignment will improve in the future. Short of the improvements there, the behavior is expected in that our "Align similar lines" algorithm is a heuristic which is going to occasionally get things wrong. That's why we include the "Align Manually" command.

The "Alternate Method" is not really succeeding in this case. It doesn't do any similarity matching at all, so the only reason the lines you're talking about are lining up is because they're both line 3 in their respective files. That's the same reason they start lining up if you reduce the skew tolerance; that's effectively disabling the alignment algorithm. The data viewer probably won't help in this case either, since it doesn't do closeness matching either.

You might be able to improve the alignment by marking whitespace and the trailing block of numbers as unimportant.
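
To illustrate the point about skew tolerance, here is a toy sketch in Python. It is not Beyond Compare's algorithm, and how BC actually interprets the setting is not stated in this thread. The function name, the `skew` and `threshold` parameters, and the sample lines are all made up for illustration. Each left line is paired with the most similar right line whose index lies within `skew` of its own; with a window of zero, a line can only ever pair with the same-numbered line, which is why shrinking the window amounts to switching similarity alignment off.

```python
from difflib import SequenceMatcher

def align_within_skew(left, right, skew, threshold=0.6):
    """Toy aligner (hypothetical): pair each left line with the most similar
    right line whose index is within `skew` of the left line's index.
    Lines with no candidate scoring above `threshold` pair with None."""
    pairs = []
    for i, left_line in enumerate(left):
        lo = max(0, i - skew)
        hi = min(len(right), i + skew + 1)
        best_j, best_score = None, threshold
        for j in range(lo, hi):
            score = SequenceMatcher(None, left_line, right[j]).ratio()
            if score > best_score:
                best_j, best_score = j, score
        pairs.append((i, best_j))
    return pairs

# Made-up tab-delimited data: the right side has one extra line at the top.
left = ["alpha\t1", "beta\t2", "gamma\t3"]
right = ["inserted\t0", "alpha\t1", "beta\t2", "gamma\t3"]

wide = align_within_skew(left, right, skew=1)    # can look one line away
narrow = align_within_skew(left, right, skew=0)  # same-numbered lines only
print(wide)
print(narrow)
```

With the wider window each line finds its identical partner one line down; with the zero window any pairing that appears is purely positional.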

chrisjj
18-Nov-2007, 07:10 AM
> Send a copy of your files to support ... so we'll be able
> to tell you whether the alignment will improve in the future.

Thanks, but here the present must take priority.

> The "Alternate Method" is not really succeeding in this case.

Ah, yes, a mere coincidence of the line sequence.

> It doesn't do any similarity matching at all

Thanks - I'd missed that in Help.

> if you reduce the skew tolerance; that's effectively
> disabling the alignment algorithm.

Now I twig - thanks.

> You might be able to improve the alignment by marking
> whitespace ... as unimportant.

No improvement found.

> marking ... the trailing block of numbers as unimportant.

How? I see no facility to do that. "Text matching \d+" marks any block. "Text matching .*(\d{12})" marks also up to any block.

Craig
18-Nov-2007, 09:07 AM
>> "Text matching \d+" marks any block. "Text matching .*(\d{12})" marks also up to any block.

Text matching \d+$ should do it.
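
The difference between the three expressions can be checked outside Beyond Compare. The sketch below uses Python's `re` module, on the assumption that BC's regex flavour treats `\d`, `+`, `.*` and `$` the same way for a single line; the sample line is invented, not taken from the files discussed above.

```python
import re

# Hypothetical tab-delimited line: a number mid-line and a trailing 12-digit block.
line = "track 07\tSome Title\t200711160439"

def spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in re.finditer(pattern, text)]

any_block = spans(r"\d+", line)        # every run of digits, wherever it sits
greedy = spans(r".*(\d{12})", line)    # the leading .* drags in everything before the block
trailing = spans(r"\d+$", line)        # only the digit run that ends the line

print(any_block)
print(greedy)
print(trailing)
```

Only the `$`-anchored pattern restricts the match to the trailing block: `\d+` also picks up the `07`, and `.*(\d{12})` matches the whole line from the first character.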