Hello Benjamin,
-----Original Message----- From: trex-devel-bounces@informatik.uni-goettingen.de [mailto:trex-devel-bounces@informatik.uni-goettingen.de] On Behalf Of Benjamin Zeiss Sent: 18 July 2006 13:23 To: TRex Development Mailing-List Subject: Re: [Trex-devel] Some explanation
Hello Kristof,
Kristóf Szabados (IJ/ETH) wrote:
No, I was thinking of creating the nodes of the tree by hand. I think it will become a problem later on if we have to use getchild().getchild() and, if that does not work, then its siblings kind of programming. In the long term it would be better to program like get_the_base_template().iterate_over_the_formal_parameters_until(). Well, this is not a good example (long name, wrong structure), but it demonstrates how much easier it is to maintain code written that way, compared to general tree handling. (This is also a working method proven by TITAN; it is somewhere between the fully general parsing of ANTLR and the hand-crafted parsers of gcc, JDT, CDT, and other commercial compilers.)
Yes, a class-based data model for the source would certainly be nice and easier to use. Implementing this later on is indeed not that simple. However, I can imagine that it is possible to introduce a data-model facade based on the LocationAST that does something like that. That way, new code could use this facade and old code could be rewritten to use it as time permits. However, I also believe that the current AST approach can make some things easier. For example, I would imagine that the reference finder algorithm could not be realized in such a generic way that easily. But these things always depend on details of the implementation and design.
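To make the facade idea concrete, here is a minimal Java sketch. The Ast node shape (getFirstChild/getNextSibling, ANTLR-2 style) and all class and method names are hypothetical stand-ins, not the actual LocationAST or TRex API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generic ANTLR-style AST node such as LocationAST.
class Ast {
    final String text;
    Ast firstChild, nextSibling;
    Ast(String text) { this.text = text; }
    Ast getFirstChild() { return firstChild; }
    Ast getNextSibling() { return nextSibling; }
}

// Typed facade: callers ask domain questions instead of chaining
// getFirstChild()/getNextSibling() calls that depend on tree shape.
class TemplateFacade {
    private final Ast node; // the underlying generic tree stays untouched
    TemplateFacade(Ast node) { this.node = node; }

    String name() { return node.getFirstChild().text; }

    // Collect the formal-parameter names, if a FormalParList child exists.
    List<String> formalParameters() {
        List<String> result = new ArrayList<>();
        for (Ast c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if ("FormalParList".equals(c.text)) {
                for (Ast p = c.getFirstChild(); p != null; p = p.getNextSibling()) {
                    result.add(p.text);
                }
            }
        }
        return result;
    }
}

public class FacadeDemo {
    public static void main(String[] args) {
        // Build a tiny tree: Template -> (name, FormalParList -> (p_id, p_payload))
        Ast template = new Ast("Template");
        Ast name = new Ast("t_base");
        Ast params = new Ast("FormalParList");
        Ast p1 = new Ast("p_id"), p2 = new Ast("p_payload");
        template.firstChild = name;
        name.nextSibling = params;
        params.firstChild = p1;
        p1.nextSibling = p2;

        TemplateFacade f = new TemplateFacade(template);
        System.out.println(f.name() + " " + f.formalParameters());
        // prints: t_base [p_id, p_payload]
    }
}
```

The point of the sketch is that old code can keep walking the generic tree while new code goes through the facade, so both styles can coexist during a gradual migration.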
Well, I never said it is easy. I know it takes a lot of time just to hand-create the classes. Actually, we are planning to support other languages and lots of deviations (Ericsson creates lots of change requests for the standard), so we must know which paths we are able to choose. For us it might be much better to spend a lot of time at the beginning to create the foundations than to correct mistakes later on. But if we create our own AST structure, it might become incompatible with your refactoring features.
There is also another problem: TRex's huge memory requirement. For the SIP protocol's files (which I assume you have), TITAN's peak memory usage only reaches 23 MB in the 3 seconds of parsing, semantic checking, code generation, and runnable generation, while TRex uses more than 100 MB for quite a few seconds to perform the parsing and metric analysis (0.5.2, and the memory usage is constant because of the nature of the problem).
The metric analysis is definitely expensive performance-wise. The latest builds offer the possibility to disable it through the project property page. The parser speed and memory footprint can probably still be tuned a lot - that is correct, but it has not been the focus of our work in Göttingen. For a really good solution, I would expect it makes sense to go the JDT route and have a lightweight AST for IDE tasks and a non-lightweight AST for possible compilation and so on. Also, note that we currently analyze a project very naively, i.e. we don't analyze dependencies between TTCN-3 modules, but load all TTCN-3 files in a project. Hence, loading times are high the first time on startup, and unnecessary files may get loaded.
Well, I was also talking about loading all the modules at once. I understand that the basic idea was a refactoring tool and that for research the tool is already quite good. But here I have seen projects containing >10 MB of source code. If the tool cannot handle this amount well, then people naturally won't use it (I, too, don't like to wait for the appearance of the letter I just typed). Parsing speed is only one issue; we can help a lot there with "rare" parsing (for example, parsing only at the end of words, or re-parsing only in the context of the change). Memory is another issue. I understand that Java has its limitations, but as soon as we reach the limits of physical memory, virtual memory kicks in, decreasing performance considerably (you should know that, for security reasons, we of course work on encrypted drives, and that slows things down a bit). We were already thinking of allowing the user to choose between fast but less precise and slow but precise ways of parsing / AST handling, but this will also need some further thought.
There is also another problem that contributes to TRex's huge memory usage: if a file is edited, a new tree is built while keeping the old tree. I understand that this way you can support content assist on incorrect files, but it temporarily doubles the memory needed; it might be better to re-parse only a subtree (though I also understand that that would be quite a lot of work).
Also true. It is probably not as hard as you think. At least I have some rough ideas about how I would try to approach it, and this would take maybe a few days to realize.
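As a rough illustration of the subtree re-parsing idea (not TRex code; the node layout and all names are hypothetical), one common approach is to find the smallest existing subtree whose source span covers the edit and rebuild only that, keeping the rest of the old tree:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical AST node that remembers its character span in the source.
class Node {
    final String kind;
    final int start, end;                 // [start, end] span in the file
    final List<Node> children = new ArrayList<>();
    Node(String kind, int start, int end) {
        this.kind = kind; this.start = start; this.end = end;
    }
    Node add(Node c) { children.add(c); return this; }
}

public class SubtreeReparse {
    // Descend as long as some child still fully covers the edited range.
    // The returned node is the only subtree that needs re-parsing; all
    // other subtrees of the old AST can be kept and shared.
    static Node smallestCovering(Node root, int editStart, int editEnd) {
        Node current = root;
        boolean descended = true;
        while (descended) {
            descended = false;
            for (Node c : current.children) {
                if (c.start <= editStart && editEnd <= c.end) {
                    current = c;
                    descended = true;
                    break;
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        Node module = new Node("module", 0, 200);
        Node funcA = new Node("function", 10, 90);
        Node funcB = new Node("function", 100, 190);
        module.add(funcA).add(funcB);
        funcB.add(new Node("statement", 110, 150));

        // An edit at offsets 120..130 only invalidates the statement subtree.
        Node dirty = smallestCovering(module, 120, 130);
        System.out.println(dirty.kind); // prints: statement
    }
}
```

The hard parts in practice (shifting the offsets of nodes after the edit, and edits that change which construct encloses them) are not shown here; this only sketches the localization step.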
We basically knew about all these problems, but you should keep in mind that TRex is the result of a few people who have worked, or are still working, on it only for their bachelor's and master's theses.
I understand that; TITAN also started as the master's thesis of our lead designer.
Also it may currently seem like we are working on TRex full-time, but actually we aren't ;-) The things you mention are problematic for productive industrial use, but they are not as important for research. Hence, these things are somewhere on our list, but they have no immediate priority. We would expect that those things get addressed by those who need them improved and we would certainly assist if there are any questions. Otherwise, they will be addressed when our time permits, but this may not be very soon. You should not forget that we wrote everything from scratch in maybe a year or less with mostly only one or two persons implementing at a time. Therefore, it shouldn't be a surprise that we cannot offer a product which is ready for commercial use.
Well, we understand that we will have to put in a considerable amount of "resources" to make it commercial quality. Right now we are just deciding whether we should actually do that, and if so, how.
-- Benjamin Zeiss _______________________________________________ Trex-devel mailing list Trex-devel@informatik.uni-goettingen.de https://user.informatik.uni-goettingen.de/mailman/listinfo/trex-devel
Kristof