2024-05-16

retro warning message

[W NNPACK.cpp:61] Could not initialize NNPACK! Reason: Unsupported hardware.

https://github.com/pytorch/pytorch/blob/70f4b3551c01230d4ab00da7bf453fa7c6b14eb9/aten/src/ATen/native/NNPACK.cpp#L52-L72

https://discuss.pytorch.org/t/bug-w-nnpack-cpp-80-could-not-initialize-nnpack-reason-unsupported-hardware/107518/23

Environment	Architecture	CPU requirements
Linux	x86-64	AVX2 and 3-level cache hierarchy

$ lscpu
Architecture:            x86_64
  CPU op-mode(s):        32-bit, 64-bit
  Address sizes:         36 bits physical, 48 bits virtual
  Byte Order:            Little Endian
CPU(s):                  8
  On-line CPU(s) list:   0-7
Vendor ID:               GenuineIntel
  Model name:            Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
    CPU family:          6
    Model:               42
    Thread(s) per core:  2
    Core(s) per socket:  4
    Socket(s):           1
    Stepping:            7
    CPU max MHz:         3800.0000
    CPU min MHz:         1600.0000
    BogoMIPS:            6784.75
    Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mc
                         a cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht
                         tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon p
                         ebs bts rep_good nopl xtopology nonstop_tsc cpuid aperf
                         mperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est t
                         m2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcn
                         t tsc_deadline_timer aes xsave avx lahf_lm epb pti ssbd
                          ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid
                         xsaveopt dtherm ida arat pln pts md_clear flush_l1d
Virtualization features:
  Virtualisation:        VT-x
Caches (sum of all):
  L1d:                   128 KiB (4 instances)
  L1i:                   128 KiB (4 instances)
  L2:                    1 MiB (4 instances)
  L3:                    8 MiB (1 instance)
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0-7
Vulnerabilities:
  Gather data sampling:  Not affected
  Itlb multihit:         KVM: Mitigation: VMX disabled
  L1tf:                  Mitigation; PTE Inversion; VMX conditional cache flushe
                         s, SMT vulnerable
  Mds:                   Mitigation; Clear CPU buffers; SMT vulnerable
  Meltdown:              Mitigation; PTI
  Mmio stale data:       Unknown: No mitigations
  Retbleed:              Not affected
  Spec rstack overflow:  Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
                          and seccomp
  Spectre v1:            Mitigation; usercopy/swapgs barriers and __user pointer
                          sanitization
  Spectre v2:            Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIB
                         P conditional; RSB filling; PBRSB-eIBRS Not affected; B
                         HI Not affected
  Srbds:                 Not affected
  Tsx async abort:       Not affected
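
The two requirements in the table (AVX2, three cache levels) can be checked directly on a Linux box. The following is a minimal sketch, not NNPACK's actual detection logic: the function names are made up for this note, and it assumes the usual Linux locations /proc/cpuinfo and /sys/devices/system/cpu/cpu0/cache. It reads /proc/cpuinfo rather than pasted lscpu output because lscpu wraps the flag list mid-word (e.g. "popcn" / "t" above).

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from /proc/cpuinfo-style text as a set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def cache_levels(cache_dir):
    """Return the cache levels (1, 2, 3, ...) found under a sysfs cache directory."""
    levels = set()
    if cache_dir.is_dir():
        for level_file in cache_dir.glob("index*/level"):
            levels.add(int(level_file.read_text().strip()))
    return levels


def meets_table_requirements(flags, levels):
    """Assumed check: avx2 flag present and cache levels 1, 2 and 3 all present."""
    return "avx2" in flags and {1, 2, 3} <= levels


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        flags = cpu_flags(cpuinfo.read_text())
        levels = cache_levels(Path("/sys/devices/system/cpu/cpu0/cache"))
        print("avx2 flag:", "avx2" in flags)
        print("cache levels:", sorted(levels))
        print("meets table requirements:", meets_table_requirements(flags, levels))
```

Keeping the parsing separate from the file reads makes it easy to run the same check against text copied from another machine.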

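The lscpu flags above include avx but not avx2: the i7-2600 is a Sandy Bridge part, which predates AVX2, so this CPU does not meet the requirement in the table.

AVX2 (Advanced Vector Extensions 2) is an expansion of the AVX instruction set, introduced by Intel with the Haswell microarchitecture in 2013 and later adopted by AMD. Its main addition is 256-bit integer vector operations. It builds on AVX, which already provided 256-bit registers and floating-point operations.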

AVX2 introduces several new features, including:

Support for 256-bit integer vector operations: AVX2 widens integer SIMD instructions from 128 bits to 256 bits, using the 256-bit YMM registers that AVX introduced, so more integer data is processed per instruction.
New integer vector instructions: AVX2 adds new instructions for integer vector arithmetic, bitwise operations, and vector shuffles, providing enhanced capabilities for tasks such as image processing, cryptography, and scientific computing.
Gather operations: AVX2 includes new instructions for indexed memory loads, enabling more efficient data movement from non-contiguous memory into vector registers. Scatter (indexed store) instructions only arrived later, with AVX-512.
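
As a rough illustration of what the wider integer operations buy: the number of integer lanes handled per instruction is the vector width divided by the element width. A small sketch of that arithmetic (the helper name `lanes` is made up for this note, and it says nothing about real-world speedup):

```python
def lanes(vector_bits, element_bits):
    """Number of elements of element_bits that fit in one vector of vector_bits."""
    if vector_bits % element_bits != 0:
        raise ValueError("element width must divide the vector width")
    return vector_bits // element_bits


if __name__ == "__main__":
    # Compare 128-bit integer SIMD (SSE-era) with 256-bit integer SIMD (AVX2).
    for element_bits in (8, 16, 32, 64):
        print(
            f"int{element_bits}: {lanes(128, element_bits)} lanes at 128 bits, "
            f"{lanes(256, element_bits)} lanes at 256 bits"
        )
```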

AVX2 first shipped in 2013 with Intel Haswell, and AMD support followed later. Some examples of CPUs that support AVX2 include:

Intel Haswell (4th generation Core processors) and newer.
Intel Broadwell, Skylake, Kaby Lake, Coffee Lake, Comet Lake, and later microarchitectures.
AMD Ryzen processors (starting with the first-generation Ryzen CPUs).
AMD Ryzen Threadripper processors.
AMD EPYC processors.

These processors offer improved performance for workloads that can use AVX2 instructions, such as machine learning, numerical simulations, and media processing.