The Persistence of Vision Raytracer (POV-Ray).
This is the legacy Bug Tracking System for the POV-Ray project. Bugs listed here are being migrated to our github issue tracker. Please refer to that for new reports or updates to existing ones on this system.
Opened by Andrey Zholos - 2011-05-29
Last edited by Christoph Lipka - 2012-06-21
FS#208 - Use-after-free when returning local function or spline identifier from macro
#macro A() #local foo = function { x } foo #end #local bar = A();
This causes either a segfault, corruption detected by malloc, or “Parse Error: Unknown user defined function”.
After some debugging I think this is what happens.
In source/backend/parser/parse.cpp, Parser::Parse_RValue is called to define the value of bar. Get_Token is called, which invokes A() and which ultimately returns foo as a FUNCT_ID_TOKEN. This token is handled by CASE_VECTOR in Parse_RValue. The relevant clause calls Parse_Unknown_Vector to parse additional tokens (e.g. “foo ( 1 )”). There aren’t any other tokens, but in the process of determining that, #end is reached and Return_From_Macro destroys the symbol table of A, including foo.
So by the time the CASE_VECTOR clause decides that foo is a function identifier that should be copied, the function is destroyed (both the function itself and its number in the symbol table). So here:
Temp_Data = (void *) Copy_Identifier((void *)*Token.DataPtr,*Token.NumberPtr);
if *Token.DataPtr (in this case, a function index) was already overwritten, we get “Unknown user defined function”; if it still has the valid function number, it increments the reference count of the function (which has already been freed) back from 0, and we get a double-free later.
A similar problem occurs when foo is a spline.
A tentative patch for the function case is attached.
Thursday, 21 June 2012, 22:49 GMT
Reason for closing: Fixed
Additional comments about closing:
no news, so I guess the fix was
effective
I can only reproduce the parse error case, which can be easily worked around by using either:
or:
Of course a segfault or mess on the heap shouldn't happen. Can you come up with a test case for that?
It's kind of random, sometimes it even works. Valgrind catches it as a read from a freed block though.
This variation seems to segfault every time on i386 Linux:
Why are you using #local ?
Do you know that #local scope is limited, and that in the actual issue, you would be using it outside of the scope.
Have you tried to replace it with #declare ?
This bug is not about what the user does - it is about how POV-Ray reacts to it. If we have an SDL script that crashes POV-Ray, we have a bug to fix.
The scope of the name is local, but values (floats, vectors, or anonymous functions) are not scoped. Otherwise it wouldn't be possible to return even a vector calculated locally by a macro. #declare does work around the issue, yes.
Here's a more practical example where this issue might be encountered. It's a library function, so it uses local variable to not interfere with the main scene. It does something with the object it creates before returning it, so clipka's first workaround doesn't help (the second one does).
I'm afraid of an issue of reference count for complex objects.
SDL is not java-nuts, it does not defined a float object for each value.
Same is probably true for vector also. (just a set of 2 to 5 values)
An evaluated function is resumed to
But more complex katamary (such as spline, or other items like texture, objects, or whatever), are returned as pointer to the object.
And in case of #local, the pointed katamary is goind to be destroyed at the #end of the scope.
spline(s) performs a copy of the spline, so it's ok as the copy is per value and is not scoped by another #local.
I wonder if "function{foo}" would also copy the local function (or even work at all).
The reference count on the returned function is incremented when it is assigned to a new identifier, it just happens too late – after it is decremented when the inner scope is destroyed.
Maybe a solution is for every identifier token to own a copy of the object it references. Even if this situation is uncommon, reference counting is cheap.
Here's another fun example:
Usually gives "Parse Error: Cannot pass uninitialized identifier as macro parameter. Initialize identifier first.", but works under valgrind (after errors about invalid reads from freed blocks).
Thanks for finding the problem. Could you give the file and line numbers where the reference counting happens? - It is easier to fix the order of incrementing and decrementing the reference count, as tokens are not C++ objects in POV-Ray 3.7, and the implications of making such a change are too big to let this happen.
In source/backend/parser/parse.cpp, in Parse_RValue, under CASE_VECTOR, there's a call to Parse_Unknown_Vector (line 8541 in RC3). When this call is made, the current token is "foo" on line 3 of the original example. Parse_Unknown_Vector ultimately reaches Get_Token (in tokenize.cpp), which sees the next token "#end", and through Parse_Directive, Return_From_Macro and Destroy_Table reaches Destroy_Entry for foo. This destroys three things:
At this point the current token is still "foo".
Back in Parse_RValue, Copy_Identifier is called (line 8561). Its argument *Token.DataPtr points through the freed SYM_ENTRY and the freed FUNCTION. If those have not been overwritten, it calls Copy_Function, incrementing the reference count to 1. (By the way, I think it's worthwhile to add an assert that the reference count is positive.) In this case you get glibc errors about double frees. If FUNCTION was overwritten, you get "Parse Error: Unknown user defined function", and I guess the segfaults occur when *Token.DataPtr is overwritten to point somewhere bad.
See the patch I attached originally. I think that's the correct strategy here: save the function number locally (so as not to access it through the destroyed token) and hold an additional reference to it across the clause.
I'm not sure what to do about splines though, as they're not reference counted.
I have another approach, a bit more radical: testing if the symbol data is in the token.
Any thought on that ? (may be someone have a better message for the Error ?)
(in particular, it is wrong with the "#undef foo" scene...)
It needs just a bit more code (to handle #undef & #fclose), but it seems to works (trapping as an error).
For information, scenes/gamma/gamma_showcase.pov is returning a #local (at line 147) (that's a bad example for the innocent user, IMHO)
#5474 (and #5475 for the scene) should fix the issue (with a message of error reporting the name of the offending symbol)
with most up to date build, –benchmark produces this error:
File '/tmp/pov4488.pov' line 996: Parse Error: The symbol A would be used out of
scope (after it has been deleted).
I looked in ~source/backend/control/benchmark.cpp and taking into account the offset to the beginning of the SDL portion that puts us at line #1104
hmmm ... line 1104 doesn't make any sense. The 1st occurrence of "A" is on line 1217
That issue (benchmark) is tracked in
FS#218.The problem with the benchmark.cpp (line numbering/offset) is that some lines count for more than one. It's difficult to correlate.
The problem always (and only) appears whenever the parser encounters a #local symbol right at the end of a macro (or include file), and expects some "parameters" to that symbol. To my knowledge, this happens in the following situations:
Anyone know any other similar situation, or can confirm that there is no other such situation?
My current build is throwing this error when running the built-in benchmark:
File '/tmp/pov18218.pov' line 996: Parse Error: The symbol A would be used out of scope (after it has been deleted).
The first occurrence of "A" is on line 1217 of ~source/backend/control/benchmark.cpp which seems a bit different than the conditions you have asked about. Although it fit's the description of item #3 above, it isn't /exactly/ at the end of the macro, so I thought it was at least mentioning the case I'm seeing ... 1000 pardons if situation fit's the known failure mechanism.
Single value and vectors are ok.
All others, if they can appear in an expression, are doomed.
I think strings might get the same way, even without expression ?
Object id are also subject to deletion, unless cloned with "object {Local_Object}" (same as spline's workaround)
#macro aa()
#local kk="hello";
kk
#end
#debug aa()
Might I suggest that you test with a destructive free implementation (alas, the povmem.c code seems out of order): a free()/POVFREE() which erase the released data block.
Indeed, a memory debugging should allow locating the problem. While the old povmem code was not thread-safe, it worked just fine in 3.6 and earlier. So tracking back through versions (back to 3.5 might be necessary) might reveal when this particular deletion sequence was broken. It should also reveal what needs to be changed to fix this again. It most certainly worked properly the last time I used memory debugging, which iirc was sometime in the 3.5 development cycle.
change #5487 tackles the issue in a different manner: Rather than detecting when an out-of-scope error has already happened, reference counting is now employed to defer deletion of out-of-scope symbols' contents until they're no longer needed. (The symbols are "un-hooked" from the symbol table for good, but their contents may live a bit longer.) This should (at least in theory) now allow for clean "returning" of local variables from macros without limitations or cumbersome (and possibly inefficient) syntax constructs.
Current sources at #5488, on Ubuntu 11.04 64 bits.
Enforced sync (forced replace of already present files), just to be sure.
I'm sorry. But I have to object as so far, it is not perfect. (more work needed, it seems)
using : valgrind –leak-check=full –log-file=n37.txt povray -Is.pov -H100 -W640 +KFF2 +WT1 +KFI0 +KI0 +KF2
s.pov being the spline code (see message from Andrey Zholos, 31 May 2011, or below), it fails.
If I do not push in valgrind, it renders.
Could we avoid that double-release, please ?
(so that tracking memory leak stays possible, thanks)
In fact, even using: export MALLOC_CHECK_=1
is enough to trigger some error in povray -Is.pov ...
[Parsing...]
File 's.pov' line 10: Possible Parse Error: Unknown spline type found.
Possible Parse Error: Attempt to free NULL pointer.
Fatal error in parser: Uncategorized error.
Render failed
can you please replace the following code in pov_free() (pov_mem.cpp):
with:
and run with valgrind so that we get to know where this actually happens?
... or just try again with change #5489. Stupid me - forgot to properly initialize the reference count for the splines. Ouch.
Warning is out of scope, so I cheated with a sprintf..
(File: math/splines.cpp Line: 653).
Testing #5494, now ok.