Stopped using jemalloc on Linux, for better compatibility with certain libraries. (#389)
Speed up rendering of flamegraphs in cases where there are many smaller allocations, by filtering out allocations smaller than 0.2% of total memory. Future releases may re-enable showing smaller allocations if a better fix can be found. (#390)
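As a rough illustration of the 0.2% cutoff described above, here is a minimal sketch. The function name, the dictionary layout, and the sample data are hypothetical and are not taken from Fil's source code.

```python
def filter_small_allocations(allocations, threshold=0.002):
    """Drop call sites whose size is below `threshold` (a fraction) of the total.

    `allocations` maps a call-site label to bytes allocated (hypothetical layout).
    """
    total = sum(allocations.values())
    cutoff = total * threshold
    return {site: size for site, size in allocations.items() if size >= cutoff}


# Hypothetical sample data: 1,000,000 bytes in total, so the cutoff is 2,000 bytes.
allocations = {"load_image": 900_000, "parse_header": 99_000, "tiny_temp": 1_000}
kept = filter_small_allocations(allocations)
print(sorted(kept))
```

With this sample data, only `tiny_temp` falls under the cutoff and is left out.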
Fixed a problem on macOS where certain subprocesses (e.g. from Homebrew) would fail to start from Python processes running under Fil. Thanks to @dreid for the bug report. (#230)
Fixed bug where aligned_alloc()-created allocations were untracked when using pip packages with Conda; specifically this is relevant to libraries written in C++. (#152)
Improved output in the rare case where allocations go missing. (#154)
Fixed potential problem with threads noticing profiling is enabled. (#156)
The number of allocations in the profiling results is now limited to 10,000. If there are more than this, the remainder are all quite tiny and so probably less informative, and including a massive number of tiny allocations makes report generation (and report display) extremely resource intensive. (#140)
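One way to picture the 10,000-allocation cap is to keep only the largest entries. This is a sketch under assumptions: the `(call_site, size)` tuple layout and the function name are hypothetical, and Fil's actual selection logic may differ.

```python
import heapq

MAX_ALLOCATIONS = 10_000  # the limit stated in the entry above


def largest_allocations(allocations, limit=MAX_ALLOCATIONS):
    """Return at most `limit` (call_site, size) pairs, largest size first."""
    return heapq.nlargest(limit, allocations, key=lambda item: item[1])


# Hypothetical data: 25,000 call sites with sizes 0..24,999 bytes.
sample = [(f"site_{i}", i) for i in range(25_000)]
top = largest_allocations(sample)
print(len(top), top[0])
```

`heapq.nlargest` avoids sorting the full list when only the top entries are needed.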
The out-of-memory detector should work more reliably on Linux. (#144)
On Linux, use a more robust method of preloading the shared library (requires glibc 2.30+, i.e. a Linux distribution released in 2020 or later). (#133)
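To check whether a system meets the glibc 2.30+ requirement, something like the following may help. It is a sketch that relies on the standard library's `platform.libc_ver()`; the helper names are hypothetical, and this is not how Fil itself performs the check.

```python
import platform

MIN_GLIBC = (2, 30)  # minimum version named in the entry above


def parse_version(text):
    """Turn a version string such as '2.31' into a comparable tuple (2, 31)."""
    return tuple(int(part) for part in text.split(".")[:2])


def glibc_is_new_enough(version_text, minimum=MIN_GLIBC):
    """Compare numerically, so that '2.9' correctly sorts below '2.30'."""
    return parse_version(version_text) >= minimum


lib, version = platform.libc_ver()
if lib == "glibc" and version:
    print(f"glibc {version}: new enough = {glibc_is_new_enough(version)}")
else:
    print("glibc not detected (not Linux, or a different C library)")
```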
Fixed a regression in Fil v0.15 that made it unusable on macOS. (#135)
Fewer spurious warnings about launching subprocesses. (#136)
Allocations in C threads are now considered allocations by the Python code that launched the thread, to help give some sense of where they came from. (#72)
It’s now possible to run Fil by doing python -m filprofiler in addition to running it as fil-profile. (#82)
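The module form is handy when Fil should run under a specific interpreter. The sketch below only builds the command line; the helper name and `example.py` are hypothetical placeholders.

```python
import sys


def fil_module_command(script, *script_args):
    """Build a `python -m filprofiler run ...` command for the current interpreter."""
    return [sys.executable, "-m", "filprofiler", "run", script, *script_args]


# `example.py` and its arguments are placeholders for illustration.
command = fil_module_command("example.py", "--size", "10")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command)` in an environment where Fil is installed.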
Small performance improvements reducing overhead of malloc()/free() tracking. (#88 and #95)
When running in Jupyter, NumPy/BLOSC/etc. thread pools are only limited to one thread when actually running a Fil profile. This means Fil’s Jupyter kernel is even closer to running the way a normal Python 3 kernel would. (#72)
Switched to using jemalloc on Linux, which should handle many small allocations better in terms of both memory usage and speed.
It also simplifies the code. (#42)
Further reduced memory overhead for tracking objects, at the cost of slightly lower resolution when tracking allocations >2GB.
Large allocations >2GB will only be accurate to a resolution of ~1MB, i.e. they might be off by approximately 0.05%. (#47)
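The 0.05% figure follows from dividing the resolution by the allocation size. The short calculation below assumes binary units (1MB = 2^20 bytes, 2GB = 2^31 bytes) and an error of at most one resolution step; both are assumptions for illustration.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def relative_error_percent(allocation_bytes, resolution_bytes=MIB):
    """Worst-case error as a percentage, assuming it is one resolution step."""
    return 100 * resolution_bytes / allocation_bytes


# At the 2GB threshold the error is largest; it shrinks as allocations grow.
print(f"{relative_error_percent(2 * GIB):.3f}%")  # 0.049%
print(f"{relative_error_percent(8 * GIB):.3f}%")  # 0.012%
```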
Command-line arguments after the script/module now work. To make this easier to implement, the command is now fil-profile run script.py instead of fil-profile script.py.
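A `run` subcommand makes it straightforward to hand everything after the script name to the script itself. The following is a minimal sketch using the standard library's `argparse`, not Fil's actual parser; the argument names `script` and `script_args` are hypothetical.

```python
import argparse


def build_parser():
    """Sketch of a CLI where `run SCRIPT [ARGS...]` forwards ARGS untouched."""
    parser = argparse.ArgumentParser(prog="fil-profile")
    subcommands = parser.add_subparsers(dest="command", required=True)
    run = subcommands.add_parser("run")
    run.add_argument("script")
    # Everything after the script name is collected as-is, including
    # options such as --size, instead of being parsed by this tool.
    run.add_argument("script_args", nargs=argparse.REMAINDER)
    return parser


args = build_parser().parse_args(["run", "script.py", "--size", "10"])
print(args.script, args.script_args)
```

Here `--size 10` ends up in `script_args` rather than being rejected as an unknown option of the profiler.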