Different users use different methods to make a list. Some use DOORS bullets, some use tabs with ASCII bullets, etc.
This inconsistency makes reliable detection of a list a challenge. Has anyone developed DXL that has a high probability of detecting the various forms of list, and would be willing to post it?
This topic has been locked.
3 replies Latest Post - 2013-01-15T08:32:18Z by SystemAdmin
Pinned topic Detecting List
llandale 270001QM9N 2939 Posts
Re: Detecting List 2013-01-07T15:01:19Z in response to OurGuest
No, but I recently had a similar issue trying to find headings in Object Text. I wonder if the following is useful. Looking at the raw text...
- are there more than 2 EOLs
- is there a TAB in the first few characters
- are the first few characters dominated by white-space
- does a RichTextParagraph have the "bullet" characteristic
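The first three checks above work on plain text, so they can be sketched outside DOORS. The following is a minimal Python illustration, not DXL and not llandale's code. The names `list_hints` and `whitespace_dominated`, the 4-character prefix, and the "more than half" threshold are all assumptions made for the example. The fourth check needs the DOORS rich text API and is not covered.

```python
def whitespace_dominated(line, prefix_len=4):
    """True if more than half of the first prefix_len characters are white-space."""
    prefix = line[:prefix_len]
    return len(prefix) > 0 and sum(c.isspace() for c in prefix) * 2 > len(prefix)


def list_hints(text, prefix_len=4):
    """Evaluate the three raw-text hints for one block of Object Text."""
    # Blank lines carry no information about bullets, so they are skipped.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    return {
        # Hint 1: more than 2 end-of-line characters in the text.
        "many_eols": text.count("\n") > 2,
        # Hint 2: a TAB within the first few characters of some line.
        "tab_near_start": any("\t" in ln[:prefix_len] for ln in lines),
        # Hint 3: the start of some line is mostly white-space.
        "leading_whitespace": any(whitespace_dominated(ln, prefix_len) for ln in lines),
    }


sample = "Items:\n\t- one\n\t- two\n\t- three"
print(list_hints(sample))
```

Each hint is weak on its own, so a caller would probably combine them, for example by treating a block as a list candidate only when at least two hints are true.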
Mathias Mamsch 2700025PVX 1910 Posts
Re: Detecting List 2013-01-11T10:59:55Z in response to OurGuest
I wrote an RTF parser once to handle a complex bullet point manipulation task. At the moment it handles the \fi, \li and \pard tags to track the indentation of the paragraph, which is probably an important point for detecting lists. You can use the code with a loop that handles certain states, e.g. "I am at the beginning of a paragraph", etc. I always wanted to put a high level interface (flow document) over it, but I never got around to doing so.
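To illustrate what tracking indentation through these tags can look like, here is a minimal Python sketch. It is not the parser described in this post and it is not DXL. It assumes the input is an RTF body fragment without a header (no font or colour tables), and the name `paragraph_indents` is made up for the example. It relies only on standard RTF behaviour: `\li` sets the left indent, `\fi` sets the first-line indent, `\pard` resets paragraph properties, and `\par` ends a paragraph.

```python
import re

TOKEN = re.compile(
    r"\\([a-z]+)(-?\d+)? ?"   # control word with optional numeric parameter
    r"|\\'[0-9a-fA-F]{2}"     # hex-escaped character, ignored in this sketch
    r"|\\[^a-z]"              # any other control symbol, ignored in this sketch
    r"|[{}]"                  # group delimiters, ignored in this sketch
    r"|[^\\{}]+"              # run of plain text
)


def paragraph_indents(rtf_body):
    """Return (left_indent, first_line_indent, text) for each paragraph."""
    left = first = 0
    text = []
    paragraphs = []
    for match in TOKEN.finditer(rtf_body):
        word, param = match.group(1), match.group(2)
        if word == "pard":            # reset paragraph formatting to defaults
            left = first = 0
        elif word == "li":            # left indent in twips
            left = int(param or 0)
        elif word == "fi":            # first-line indent in twips
            first = int(param or 0)
        elif word == "tab":
            text.append("\t")
        elif word == "par":           # end of paragraph: record its state
            paragraphs.append((left, first, "".join(text)))
            text = []
        elif word is None and match.group(0)[0] not in "\\{}":
            # Raw CR/LF characters in RTF source are not part of the text.
            text.append(match.group(0).replace("\r", "").replace("\n", ""))
    if text:
        paragraphs.append((left, first, "".join(text)))
    return paragraphs


body = r"\pard Intro\par\pard\li720\fi-360 \'b7\tab First\par \'b7\tab Second\par\pard Done\par"
for left, first, text in paragraph_indents(body):
    print(left, first, repr(text))
```

In this fragment the two middle paragraphs share a left indent of 720 and a hanging first-line indent of -360, a pattern that often accompanies bullets, while the paragraphs after each plain `\pard` fall back to zero.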
Would you be interested in having that RTF parser as open source? I am still looking for people who would be willing to commit to an open source DXL library. Sorry that the library is not well commented, but I guess you will grasp the functionality pretty fast. You can use the nextToken() function in a loop to parse the RTF and then roam around in the tokens by using getToken(RTFParser, int) to look forward or backward. The RTFParser also tracks the group name, which you can use to determine the location in the RTF, e.g. "rtf:fnttble:fnt" or something like this tells you that you are in the font table of the RTF.
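The two ideas in this paragraph, a token list you can look around in and a group name that tells you where you are, can be sketched as follows. This is a Python illustration under assumptions, not the library's code: `tokenize` and `peek` are hypothetical stand-ins rather than the DXL `nextToken()` and `getToken()` functions, and the group path is built from the first control word in each group, so it yields names such as `rtf:fonttbl:f`, which may differ from what the real parser reports.

```python
import re

TOKEN = re.compile(
    r"\\([a-z]+)(-?\d+)? ?|\\'[0-9a-fA-F]{2}|\\[^a-z]|[{}]|[^\\{}]+"
)


def tokenize(rtf):
    """Return a list of (kind, value, group_path) tuples."""
    stack = []   # one entry per open group; None until the group is named
    tokens = []
    for match in TOKEN.finditer(rtf):
        raw = match.group(0)
        if raw == "{":
            stack.append(None)
            kind, value = "open", raw
        elif raw == "}":
            kind, value = "close", raw
        elif match.group(1):
            kind, value = "word", match.group(1)
            # The first control word inside a group gives the group its name.
            if stack and stack[-1] is None:
                stack[-1] = value
        elif raw.startswith("\\"):
            kind, value = "symbol", raw
        else:
            kind, value = "text", raw
        path = ":".join(name or "?" for name in stack)
        tokens.append((kind, value, path))
        if raw == "}" and stack:
            stack.pop()   # leave the group only after recording the token
    return tokens


def peek(tokens, index, offset):
    """Return the token at index + offset, or None outside the list."""
    target = index + offset
    return tokens[target] if 0 <= target < len(tokens) else None


tokens = tokenize(r"{\rtf1{\fonttbl{\f0 Arial;}}\pard Hello\par}")
body_text = [value for kind, value, path in tokens if kind == "text" and path == "rtf"]
print(body_text)
```

Filtering on the group path is what keeps font names and similar header data out of the body text, which matters when the later checks look for bullet characters.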
I am not sure this will help, but I am pretty sure that for reliably detecting lists you will need to parse RTF. Additionally, you probably need to define the criteria that you would use for detecting lists, e.g. more than one space, a tab, or a line indent, followed by a bullet symbol or a number, and all of that at least twice?
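One way to pin such criteria down is to write them as a pattern plus a minimum run length. The Python sketch below is one possible reading of the criteria in this paragraph, working on plain text lines only: a line counts as a candidate if it starts with a tab or at least two spaces, followed by a bullet symbol or a number. The name `find_lists`, the chosen bullet characters, and the requirement that candidates be consecutive are assumptions for the example. A line indent taken from RTF would need the parsed paragraph data and is not covered.

```python
import re

# Tab or 2+ spaces, then a bullet symbol or a number such as "1." or "2)",
# then white-space and the start of the item text.
ITEM = re.compile(r"^(?:\t| {2,})[ \t]*(?:[-*\u2022\u00b7]|\d+[.)])[ \t]+\S")


def find_lists(lines, min_items=2):
    """Return (start, end) index pairs for runs of at least min_items candidates."""
    runs = []
    start = None
    # The empty sentinel line closes a run that reaches the end of the input.
    for i, line in enumerate(list(lines) + [""]):
        if ITEM.match(line):
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_items:
                runs.append((start, i - 1))
            start = None
    return runs


lines = ["Intro", "\t- one", "\t- two", "Text", "  1. only", "End"]
print(find_lists(lines))
```

Requiring at least two items in a row is what separates a list from a single indented line that merely begins with a dash or a number.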
Mathias Mamsch, IT-QBase GmbH, Consultant for Requirement Engineering and D00RS
SystemAdmin 110000D4XK 3180 Posts
Re: Detecting List 2013-01-15T08:32:18Z in response to Mathias Mamsch
Hello Mathias
> I wrote an RTF parser once to handle a complex bullet point manipulation task.
Maybe the project http://nrtftree.sourceforge.net/ is a good template for this.