Different users use different methods to make a list. Some use DOORS bullets, some use a tab with ASCII bullets, etc.
This inconsistency makes reliable detection of a list a challenge. Has anyone developed DXL that has a high probability of detecting the various forms of list, and would they be willing to post it?
llandale 270001QM9N 3035 Posts
Re: Detecting List 2013-01-07T15:01:19Z
No, but I recently had a similar issue trying to find headings in Object Text. I wonder if the following is useful. Looking at the raw text:
- are there more than 2 EOLs?
- is there a TAB in the first few characters?
- are the first few characters dominated by white space?
- does a RichTextParagraph have the "bullet" characteristic?
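The first three checks above can be sketched on plain text. This is only an illustration in Python rather than DXL; the function name `raw_text_hints`, the `prefix_len` parameter and the "more than half" threshold for "dominated" are assumptions, not anything from DOORS. The fourth check depends on the DOORS rich text API and is not modeled here.

```python
def raw_text_hints(text, prefix_len=4):
    """Return simple hints that a block of plain text may contain a list."""
    # Ignore blank lines; look only at the first few characters of each line.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    prefixes = [ln[:prefix_len] for ln in lines]
    return {
        # More than two line breaks suggests several short items.
        "many_eols": text.count("\n") > 2,
        # A TAB within the first few characters of any line.
        "tab_near_start": any("\t" in p for p in prefixes),
        # Number of lines whose prefix is more than half white space.
        "whitespace_led_lines": sum(
            1 for p in prefixes
            if sum(c.isspace() for c in p) * 2 > len(p)
        ),
    }
```

None of these hints is conclusive on its own; they are probably best combined into a score and tuned against real Object Text samples.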
Mathias Mamsch 2700025PVX 2386 Posts
Re: Detecting List 2013-01-11T10:59:55Z
I wrote an RTF parser once to handle a complex bullet point manipulation task. At the moment it handles the fi, li and pard tags to track the indentation of the paragraph, which is probably an important point for detecting lists. You can use the code with a loop that handles certain states, e.g. "I am at the beginning of a paragraph". I always wanted to put a high-level interface (flow document) over it, but I never got around to doing so.
Would you be interested in having that RTF parser as open source? I am still looking for people who would be willing to commit to an open source DXL library. Sorry that the library is not well commented, but I guess you will get the functionality pretty fast. You can use the nextToken() function in a loop to parse the RTF and then roam around in the tokens by using getToken(RTFParser, int) to look forwards and backwards. The RTFParser also tracks the group name, which you can use to determine the location in the RTF, e.g. "rtf:fnttble:fnt" or something like this tells you that you are in the font table of the RTF.
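As a rough illustration of the token-loop idea, here is a minimal sketch in Python (not DXL, and not the parser described above). It tokenizes RTF, skips a few header groups, and records the `\li` and `\fi` values in effect for each paragraph. The names `TOKEN`, `SKIPPED` and `paragraph_indents` are hypothetical, and real RTF scopes formatting per group, which this sketch ignores.

```python
import re

TOKEN = re.compile(
    r"\\'([0-9a-fA-F]{2})"       # hex-escaped character, e.g. \'95
    r"|\\([a-zA-Z]+)(-?\d+)? ?"  # control word, optional argument, one delimiting space
    r"|\\(.)"                    # control symbol, e.g. \* or \{
    r"|([{}])"                   # group open or close
    r"|([^\\{}]+)",              # plain text run
    re.DOTALL,
)
# Header groups whose text is not document content (assumed subset).
SKIPPED = {"fonttbl", "colortbl", "stylesheet", "info"}

def paragraph_indents(rtf):
    """Return a list of (left_indent, first_line_indent, text) per paragraph."""
    li = fi = 0
    text, paragraphs = [], []
    skip = [False]  # one flag per open group; nested groups inherit it
    for m in TOKEN.finditer(rtf):
        hexch, word, arg, symbol, brace, run = m.groups()
        if brace == "{":
            skip.append(skip[-1])
        elif brace == "}":
            if len(skip) > 1:
                skip.pop()
        elif word in SKIPPED or symbol == "*":
            skip[-1] = True
        elif skip[-1]:
            continue
        elif word == "pard":      # reset paragraph formatting
            li = fi = 0
        elif word == "li":        # left indent
            li = int(arg or 0)
        elif word == "fi":        # first-line indent
            fi = int(arg or 0)
        elif word == "tab":
            text.append("\t")
        elif word == "par":       # end of paragraph
            paragraphs.append((li, fi, "".join(text)))
            text = []
        elif hexch:
            # Assumes the Windows-1252 code page, where 0x95 is a bullet.
            text.append(bytes([int(hexch, 16)]).decode("cp1252", "replace"))
        elif symbol in ("\\", "{", "}"):
            text.append(symbol)   # escaped literal character
        elif run:
            text.append(run.replace("\r", "").replace("\n", ""))
    if text:
        paragraphs.append((li, fi, "".join(text)))
    return paragraphs
```

A paragraph with a positive left indent and a negative first-line indent is a hanging indent, which is typical of bulleted and numbered items, so the returned tuples are a reasonable input for whatever list criteria are chosen.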
I am not sure this will help, but I am pretty sure that to reliably detect lists you will need to parse the RTF. Additionally you probably need to define the criteria that you would use for detecting lists, e.g. more than one space, a tab or a line indent, followed by a bullet symbol or a number, with all of that occurring at least twice?
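One possible reading of such criteria, sketched in Python on plain text lines only (a line indent would need the RTF): leading white space, then a bullet symbol or a number, repeated on at least two consecutive lines. The pattern `LIST_LINE`, the accepted bullet characters and the function `find_list_runs` are assumptions chosen for illustration.

```python
import re

# Two or more spaces or a tab, then a bullet (-, *, U+2022) or a number
# followed by "." or ")", then white space and some item text.
LIST_LINE = re.compile(r"^(?: {2,}|\t)[ \t]*(?:[-*\u2022]|\d+[.)])[ \t]+\S")

def find_list_runs(lines, min_items=2):
    """Return (start, end) index pairs, end exclusive, of consecutive list-like lines."""
    runs, start = [], None
    # The empty sentinel line closes a run that reaches the end of the input.
    for i, line in enumerate(list(lines) + [""]):
        if LIST_LINE.match(line):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_items:
                runs.append((start, i))
            start = None
    return runs
```

Requiring at least two consecutive matches filters out isolated lines that merely start with a dash or a number, at the cost of missing single-item lists.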
Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS
SystemAdmin 110000D4XK 3180 Posts
Re: Detecting List 2013-01-15T08:32:18Z
- Mathias Mamsch 2700025PVX
> I wrote an RTF parser once to handle a complex bullet point manipulation task.
Maybe the project http://nrtftree.sourceforge.net/ is a good template for this.