Release History
Version |
Date |
Changes |
Files |
0.3.1 |
6 Sep 2005 |
- Fixed class scope vector typedefs, missing PowerPC intrinsics header, vector initializer syntax for FSF 3.4 [ILi*].
- Added complex conj function for vec and valarray [ILi*].
- Improved valarray expression performance: v1 [slice].
- Improved valarray code generation: CSE, inlining limits, literal terms, array term elements, statarray construction, compiling -faltivec without -maltivec for Apple gcc 4.0.
- Added refarray class [PBa].
- Fixed buffer overflow in integral valarrays for SSE2; added optimizations for valarray expressions: v1 >> k and v1 << k for SSE2 [MSh].
- Fixed accumulate array dispatch, integer constant overflow, literal benchmark test for SSE2; fixed chunking iterator pessimization for gcc 3.3/4 [ILi, RBe].
- Added makefile for Linux x86 [ILi*].
- Added support for FSF gcc 3.4 on Cygwin 1.5.
- Added differently typed valarray construct and assign from terms, valarrays of sized booleans, select with sized booleans [ILi].
- Fixed unix makefile directory.
- Added macstlizer conversions: abs, abss, cmpeq, max, min.
- Improved readme file.
|
dmg (220K)
tgz (170K)
zip (232K) |
0.3 |
28 Jun 2005 |
- Added support for Apple gcc 4.0 on Mac OS X 10.4.
- Added support for FSF gcc 3.4 on Yellow Dog Linux [MLe*].
- Improved vec design and performance.
- Added support for Xcode 2.1 and Universal Binaries (Intel/PowerPC).
- Added differently typed arguments for vec and valarray functions.
- Added select function for vec and valarray [JTe].
- Fixed undefined __bool and test template template parameters for Codewarrior.
- Added macstlizer script and header; added SSE2 integer shift intrinsics.
- Improved macstlizer script; added vec <pixel, 8> for Altivec, SSE/SSE2 memory intrinsics.
- Added binary min and max for valarray [ILi].
- Fixed vector cast error in template static member functions for Apple gcc 3.3 on Mac OS X 10.3.
- Added rsqrt function for vec and valarray; added optimizations for valarray expressions: v1 / sqrt (v2) and v1 / sqrt (v2) + v3 [ABr].
- Fixed missing #include in macstlizer script.
|
dmg (226K)
tgz (184K)
zip (246K) |
0.2.2 |
28 Mar 2005 |
- Fixed error when linking more than one object file: template function should be inline [KHe].
- Added mulhi function for vec and valarray [MSh].
- Improved valarray expression template and iterator design.
|
dmg (193K)
tgz (151K)
zip (208K) |
0.2.1 |
14 Feb 2005 |
- Fixed member and binary min and max for vec <unsigned short, 4> [PBa].
- Fixed #include error with own projects [DCh].
- Added support for Intel ICC 8.1 [ACu].
- Fixed truncation of signed constants in unsigned parameters [DPi].
- Added partial support for IBM XLC 6.0.
- Fixed header access paths and missing functions malloc, free, vm_allocate, vm_copy, vm_deallocate for Codewarrior.
- Fixed #include <sys/mman.h> error, domain in trigonometric test for VC++. Improved inlining for ICC.
|
dmg (191K)
tgz (150K)
zip (208K) |
0.2 |
31 Jan 2005 |
- Added portable SIMD classes: partial support for MMX/SSE/SSE2/SSE3, standard initialization syntax.
- Added high-performance transcendental functions: exp, log, sin, cos, tan.
- Added fused multiply-add optimization for valarrays.
- Added complex number arithmetic.
- Added fast integer division and modulus.
- Added adapters for Core Foundation and Foundation classes for STL.
- Added memory mapped containers.
- Updated 400+ VPERM constants.
- Updated Mach specialized vector.
- Updated COM server implementation.
- Updated and added many more unit tests and benchmarks.
- Changed licensing terms.
- Now supported on Apple Xcode 1.5 + gcc 3.3, Metrowerks CodeWarrior 9.3 and Microsoft Visual C++ .NET 2003.
- Many other fixes and improvements.
|
dmg (281K)
tgz (143K)
zip (197K) |
0.1.5 |
3 Nov 2003 |
- Now builds with Xcode 1.0 [*CBe].
- Fixed altivec functions not compiling in gcc 3.1: tree_list not supported by dump_expr [JLWi].
- Reduced download size by removing build directories [ASt].
- Fixed altivec abs ambiguity.
- Fixed internal valarray #includes.
- Fixed uninitialized_copy_n not compiling with non-random access iterators.
|
sit (436K)
tgz (680K)
zip (762K) |
0.1.4 |
29 Sep 2003 |
- Added shift member to altivec class.
- Enabled generated constants only for NDEBUG builds [HBe].
- Renamed sl to operator<< and sra to operator>>.
- Fixed 1,021 constants for correctness and efficiency [*HBe].
- Fixed sqrt not compiling or returning incorrect results for CodeWarrior.
- Fixed unary functions not inlining (0.1.3).
- Fixed some binary operators not compiling for altivec boolean types (0.1.3).
- Fixed namespace of min, max altivec functions.
|
sit (1.5M)
tgz (1.9M)
zip (2.0M) |
0.1.3 |
25 Sep 2003 |
- Now builds with Metrowerks CodeWarrior 9.
- Now builds with August 2003 gcc 3.3 Updater.
- Added 65,566 generated Altivec constants, using optimal instruction sequences without memory loads [*HBe].
- Added all intrinsic Altivec operators.
- Added unit tests for operators and scalarizers against the builtin types [WTo].
- Added doxygen comments to most public classes and some internal classes.
- Added cshift member to altivec class.
- Added pixel class.
- Now works correctly as #included system headers: added include guards, fixed internal #includes.
- Simplified term use of template template functors, used explicit enchunking of functors.
- Removed libstdc++ dependencies: added destroy_n algorithm and identity functor, rewrote uninitialized_copy_n and copy_n, remapped type traits system.
- Fixed terms including >> and << not compiling.
- Fixed initial valarray fill not compiling if the type had a non-trivial destructor.
- Fixed valarray of bool compiling incorrectly if sizeof (bool) != 4.
- Fixed binder1st/2nd with differing argument types not compiling.
- Fixed operator- of some term iterators not compiling.
- Fixed incorrect value for chunked sum of char.
|
sit (2.2M)
zip (2.7M) |
0.1.2 |
18 Aug 2003 |
- Optimized chunked slicing of floats and longs, e.g. v1 [slice]: used Altivec permutes, 39% faster.
- Optimized chunked relational min, max, e.g. (v1 == v2).min (): used Altivec predicates, 21% faster.
- Optimized unchunked bool-valued min, max: 2.44x faster.
- Now builds correctly with Project Builder 2.1 and December 2002 gcc 3.3 Updater.
- Refactored entire term hierarchy for easier value type partial specialization.
- Allocations are now exact instead of rounded up to Altivec alignment; algorithms adjusted to use scalar code on array tails.
- Fixed bug where element access to a unary term could not compile, e.g. sin (v1) [0].
- Fixed bug where chunked min, max or sum of boolean could not compile.
|
sit (84K)
zip (96K) |
0.1.1 |
26 Jul 2003 |
- Fixed gcc 3.3 compile issues: stricter access control, in-class init of vector constant [ACu].
|
sit (72K)
zip (80K) |
0.1 |
9 Jul 2003 |
|
sit (70K)
zip (78K) |