2024-06-29 Kim Walisch  <kim.walisch@gmail.com>

  Version 3.1

  * Improve AVX512 algorithm for trailing 64 bytes.
  * AVX512 algorithm no longer requires the AVX512-BITALG extension.

2024-06-27 Kim Walisch  <kim.walisch@gmail.com>

  Version 3.0

  * Add ARM SVE algorithm.
  * Replace AVX512BW algorithm with faster AVX512 VPOPCNTDQ algorithm.
  * Add MSVC support for ARM NEON.
  * Improve preprocessor checks using __has_include() macro.
  * Port tests from AppVeyor to GitHub Actions.
  * Get rid of unaligned uint64_t memory accesses; this fixes
    test failures when using GCC compiler sanitizers.
  * Prefix all libpopcnt macros using LIBPOPCNT_ to avoid any naming collisions.
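
  A minimal sketch of how unaligned uint64_t accesses are commonly
  avoided: each word is copied out of the byte buffer with memcpy
  instead of dereferencing a casted pointer. The helper names
  (load_u64, popcount64, popcount_bytes) are hypothetical and are
  not claimed to match the library's internals.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read 8 bytes from any address without
// assuming 8-byte alignment. Compilers typically turn this
// memcpy into a single load instruction.
static inline std::uint64_t load_u64(const void* ptr)
{
    std::uint64_t value;
    std::memcpy(&value, ptr, sizeof(value));
    return value;
}

// Portable bit count for one 64-bit word (clears the lowest
// set bit on each iteration).
static inline std::uint64_t popcount64(std::uint64_t x)
{
    std::uint64_t count = 0;
    while (x != 0)
    {
        x &= x - 1;
        ++count;
    }
    return count;
}

// Count the 1 bits in an arbitrary, possibly unaligned byte range.
static inline std::uint64_t popcount_bytes(const void* data, std::size_t size)
{
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::uint64_t count = 0;
    std::size_t i = 0;

    // Process full 8-byte words via the alignment-safe load.
    for (; i + 8 <= size; i += 8)
        count += popcount64(load_u64(bytes + i));

    // Process the remaining 0..7 trailing bytes one at a time.
    for (; i < size; ++i)
        count += popcount64(bytes[i]);

    return count;
}
```

  Because the load goes through memcpy, passing a pointer such as
  buf + 1 is well defined and should not trip sanitizer alignment
  checks.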

2020-08-10 Kim Walisch  <kim.walisch@gmail.com>

  * Use unaligned memory accesses to improve performance.
    Aligning memory causes branch mispredictions which
    can significantly deteriorate performance especially
    for small array sizes.
  * On x86/x64, runtime checks are now removed if the user
    compiles their code with e.g. -mpopcnt, -mavx2,
    -march=native, ...
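
  A rough sketch of the idea, assuming GCC or Clang: when the
  compiler predefines __POPCNT__ (as it does with -mpopcnt), the
  instruction can be used directly, otherwise a CPU check is done
  once at runtime. The function names are hypothetical and the
  library's actual preprocessor logic may differ.

```cpp
#include <cstdint>

// Portable fallback for CPUs without a POPCNT instruction.
static inline std::uint64_t popcount64_portable(std::uint64_t x)
{
    std::uint64_t count = 0;
    while (x != 0)
    {
        x &= x - 1;
        ++count;
    }
    return count;
}

#if defined(__POPCNT__)

// POPCNT was enabled at compile time: no runtime check needed.
static inline std::uint64_t popcount64_dispatch(std::uint64_t x)
{
    return static_cast<std::uint64_t>(__builtin_popcountll(x));
}

#elif defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))

// POPCNT is enabled only for this one function.
__attribute__((target("popcnt")))
static std::uint64_t popcount64_hw(std::uint64_t x)
{
    return static_cast<std::uint64_t>(__builtin_popcountll(x));
}

// Query the CPU once, then pick the implementation on each call.
static inline std::uint64_t popcount64_dispatch(std::uint64_t x)
{
    static const bool has_popcnt = __builtin_cpu_supports("popcnt") != 0;
    return has_popcnt ? popcount64_hw(x) : popcount64_portable(x);
}

#else

// Other compilers or architectures: use the portable code.
static inline std::uint64_t popcount64_dispatch(std::uint64_t x)
{
    return popcount64_portable(x);
}

#endif
```

  With this layout, building with -mpopcnt or -march=native selects
  the first branch, so the compiled code contains no CPU check.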

2020-07-19 Kim Walisch  <kim.walisch@gmail.com>

  * Enable AVX2, AVX512 by default for MSVC.
    Thanks to KOConchobhair for the pull request #15.

2019-12-31 Kim Walisch  <kim.walisch@gmail.com>

  Version 2.3 released.

  * Up to 10% speedup on ARM NEON.
  * Fix unaligned memory access on ARM.

2018-01-13 Kim Walisch  <kim.walisch@gmail.com>

  Version 2.2 released.

  * Up to 6x faster on old x86 CPUs without POPCNT.

2017-11-10 Kim Walisch  <kim.walisch@gmail.com>

  Version 2.1 released.

  * Up to 20% ARM NEON speedup.
  * test/test1.cpp: Add test case with only 1 bits.
  * test/test2.c: Add test case with only 1 bits.

2017-10-27 Kim Walisch  <kim.walisch@gmail.com>

  Version 2.0 released.

  * Add AVX512 support.
  * Support AVX2 and AVX512 for the MSVC compiler.
  * Support CPUID for MSVC and the C programming language.

2017-09-09 Kim Walisch  <kim.walisch@gmail.com>

  Version 1.9 released.

  * libpopcnt.h: Fix MSVC POPCNT detection.

2017-04-24 Kim Walisch  <kim.walisch@gmail.com>

  Version 1.8 released.

  * Fix "illegal instruction" runtime crash with GCC 4.4.7.
  * benchmark.cpp: Print statistics and algorithm name.
  * Reduce CMake minimum version to 2.8.

2017-04-08 Kim Walisch  <kim.walisch@gmail.com>

  Version 1.7 released.

  * Refactor ARM NEON popcount algorithm.

2017-04-04 Kim Walisch  <kim.walisch@gmail.com>

  Version 1.6 released.

  * Add ARM NEON popcount algorithm.
  * Fix x86 segmentation fault for CPUs without POPCNT.

2017-04-02 Kim Walisch  <kim.walisch@gmail.com>

  Version 1.5 released.

  * libpopcnt.h now supports C and C++!
  * Use CMake build system.
  * test2.c: Add C test.
  * Fix clang-cl bug on Windows.

2017-04-01 Kim Walisch  <kim.walisch@gmail.com>

  Version 1.4 released.

  * libpopcnt.h: Fix compiler warning.

2017-03-30 Kim Walisch  <kim.walisch@gmail.com>

  Version 1.3 released.

  libpopcnt.h is now C++ only (previously C/C++) because the
  cpuid check cannot be made thread-safe using plain C,
  whereas in C++ it's trivial (Meyers Singleton).
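
  For illustration, a minimal sketch of the Meyers Singleton
  pattern: since C++11 a function-local static is initialized
  exactly once, even if several threads call the function at the
  same time. query_cpu_features() is a hypothetical stand-in for
  the real CPUID query, and the counter exists only to make the
  single initialization observable.

```cpp
#include <atomic>

// Counts how often the (expensive) query actually runs.
static std::atomic<int> query_calls(0);

// Hypothetical stand-in for executing the CPUID instruction.
// Real code would return feature bits read from the CPU;
// here bit 0 is an assumed "POPCNT available" flag.
static int query_cpu_features()
{
    ++query_calls;
    return 1;
}

// Meyers Singleton: the static local is initialized on first use,
// and C++11 guarantees that this initialization is thread-safe.
static int cpu_features()
{
    static const int features = query_cpu_features();
    return features;
}

// Convenience wrapper testing the assumed POPCNT flag.
static bool has_popcnt()
{
    return (cpu_features() & 1) != 0;
}
```

  Every call after the first only reads the cached value, so the
  query runs once no matter how often has_popcnt() is called.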

  * Add benchmark.cpp.
  * Use AVX2 for arrays >= 512 bytes (previously 1024).
  * Improve Makefile.

2017-03-26 Kim Walisch  <kim.walisch@gmail.com>

  Version 1.2 released.

  * Add cpuid check for x86 CPUs.
  * Compiles without -mpopcnt, -mavx2 flags.
  * Successfully tested on IBM POWER8 (generates popcntd, GCC 5.4).
  * Successfully tested using clang-cl (Windows).
  * Add ChangeLog.
  * Update README.md.
