As I'm continuously working on File Formats (with a long list of others I still want to do), I'd like a little more insight into how Grammars are processed, and whether or how the techniques I'm using now may affect efficiency.
My current approach is as follows:
- I put Grammar items that need to be processed first earlier in the list (this mostly works)
- I separate Grammar items for the same "element" type into logical items rather than batching them all together. For instance, I'm working on a Grammar for Apache 1.3, where I have a large number of "directive" Grammar items: one for the core and one for each module, several of which have only one, two, or three directives. While I can't add a comment to each item (yet?), keeping them in alphabetical order by module at least helps with further maintenance and will help when deriving a second File Format for Apache 2.x.
- When one logical "element" type needs mostly plain text items and a (small) number of regular expressions, I tend to put the regular expressions in a separate grammar item
Specific questions:
- How would a Grammar set up as described above be processed?
- Would separating an element type into a large number of Grammar items have any negative impact, either on startup or when the Grammar is applied to a file?
- Does a mix of "basic" and "list" items for a single element type have any impact?
- Does it help to separate out the regular expressions from a list with mostly non-REs?
- Does it make any difference whether a "list" item is sorted alphabetically?
- What happens when several lists for the same "element" contain identical strings?
It would be great to have some insight into these issues. My File Formats tend to become rather large, and while maintainability is an issue (and being able to add notes would help a great deal!), for the larger Grammars performance becomes an issue as well. If I knew more about how these are processed, I might be able to better balance the techniques I'm using.
Thanks.