Port Intel x86-64 intrinsic function to RISC-V or ARM

daisukeokaoss

岡 大輔(Daisuke Oka)

Posted on April 10, 2023


I am researching ways to port Intel x86-64 intrinsic functions to RISC-V or ARM. The goal is to address the problem of code that depends on a single CPU architecture.
Intel x86-64 is a CISC (Complex Instruction Set Computer) architecture with a very long history. Once code runs on an x86-64 machine, it is expected to keep running on every later one, so the architecture carries a lot of legacy. A CISC processor like Intel's decodes its variable-length instructions into RISC-like micro-operations. That needs extra decoding circuitry, which increases die size and adds overhead such as higher power consumption.
ARM and RISC-V make the same promise that code which runs today must keep running, but their instruction sets are simpler and they can take advantage of relatively new technology.
But Intel holds the No. 1 market share in supercomputers, PCs, and servers, because many Linux applications are built for Intel.
So if applications that run only on Intel could run on RISC-V or ARM in a well-optimized way, it would be very advantageous.

To run Intel x86-64 applications on RISC-V or ARM in a well-optimized way, we need to port the Intel intrinsic functions to those architectures.
The document below is very helpful.

https://openpowerfoundation.org/specifications/vectorintrinsicportingguide/

The Intel Intrinsics API exposes the instruction set extensions that Intel keeps adding, including SIMD (Single Instruction, Multiple Data) instructions.

Porting an Intel x86-64 intrinsic function to a RISC architecture such as IBM POWER takes a specific wrapper structure like the one below.

extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__,__artificial__))
_mm_add_pd (__m128d __A, __m128d __B)
{
   return (__m128d) ((__v2df)__A + (__v2df)__B);
}

_mm_add_pd is an Intel intrinsic function that adds __A and __B element by element. With this wrapper in place, code that calls the Intel intrinsic can be compiled for a RISC architecture such as IBM POWER.
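To make the wrapper idea concrete, here is a minimal sketch that can be compiled on any target, assuming GCC or Clang with their vector_size extension. The sketch_ names are hypothetical stand-ins for __m128d and _mm_add_pd, chosen so that the file does not clash with a real emmintrin.h. It is not the actual header code.

```c
/* Assumption: GCC or Clang, which provide the vector_size extension.
   sketch_m128d is a hypothetical stand-in for __m128d: two doubles
   packed into 16 bytes. */
typedef double sketch_m128d __attribute__((vector_size(16)));

/* Element-wise addition written with the generic vector "+" operator.
   The compiler picks the target's own vector instruction if there is
   one, and falls back to scalar code otherwise. */
static inline sketch_m128d sketch_mm_add_pd(sketch_m128d a, sketch_m128d b)
{
    return a + b;
}

/* Helper for illustration: adds {a0, a1} and {b0, b1} and writes the
   two sums to out[0] and out[1]. */
static void sketch_add_demo(double a0, double a1, double b0, double b1,
                            double out[2])
{
    sketch_m128d a = { a0, a1 };
    sketch_m128d b = { b0, b1 };
    sketch_m128d r = sketch_mm_add_pd(a, b);
    out[0] = r[0];
    out[1] = r[1];
}
```

Because the body uses only the generic vector operator, the same source should build for x86-64, POWER, ARM, or RISC-V without architecture-specific code.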

Here is another example.

extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__,__artificial__))
_mm_set1_pd (double __F)
{
   return __extension__ (__m128d){ __F, __F };
}

This copies the value __F into both elements of an __m128d vector.
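The broadcast can be tried out in the same portable style. This is a sketch only, again assuming GCC or Clang vector extensions. sketch_mm_set1_pd and sketch_store_pd are hypothetical names, not part of any real header.

```c
/* Assumption: GCC or Clang vector extensions; all sketch_ names are
   hypothetical and exist only for this illustration. */
typedef double sketch_m128d __attribute__((vector_size(16)));

/* Broadcast one double into both lanes of a 128-bit vector. */
static inline sketch_m128d sketch_mm_set1_pd(double f)
{
    return (sketch_m128d){ f, f };
}

/* Copy the two lanes out to memory so they can be inspected. */
static void sketch_store_pd(double out[2], sketch_m128d v)
{
    out[0] = v[0];
    out[1] = v[1];
}
```

Calling sketch_store_pd(out, sketch_mm_set1_pd(3.5)) leaves 3.5 in both out[0] and out[1].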

I am planning to build a test framework for these wrappers.
Each port must be validated to make sure it is correct.
So we feed the same input values to Intel and to the target architecture (IBM POWER, RISC-V, or ARM) and check that the outputs are the same.
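One possible shape for such a check is sketched below. It is an assumption for illustration, not the framework's actual design. The idea is that expected outputs are recorded once on an Intel machine, and the ported wrapper is then run on the target with the same inputs and compared bit for bit. The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: GCC or Clang vector extensions; sketch_ names are hypothetical. */
typedef double sketch_m128d __attribute__((vector_size(16)));

/* Ported wrapper under test. */
static inline sketch_m128d sketch_mm_add_pd(sketch_m128d a, sketch_m128d b)
{
    return a + b;
}

/* One test case: two inputs plus the output recorded on the
   reference (Intel) machine. */
struct sketch_case {
    double a[2];
    double b[2];
    double expected[2];
};

/* Returns how many cases produce output that differs from the
   recorded output. */
static size_t sketch_count_mismatches(const struct sketch_case *cases, size_t n)
{
    size_t failures = 0;
    for (size_t i = 0; i < n; i++) {
        sketch_m128d a = { cases[i].a[0], cases[i].a[1] };
        sketch_m128d b = { cases[i].b[0], cases[i].b[1] };
        sketch_m128d r = sketch_mm_add_pd(a, b);
        double got[2] = { r[0], r[1] };
        /* memcmp compares bit patterns, so a difference in the sign
           of zero is reported as a mismatch too. */
        if (memcmp(got, cases[i].expected, sizeof got) != 0)
            failures++;
    }
    return failures;
}
```

A bitwise comparison is the strictest choice. For intrinsics whose results are allowed to differ slightly between architectures, a tolerance-based comparison might be more appropriate.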

I named this test framework Akari, which means "light" in Japanese.

https://www.slideshare.net/OkaDaisuke/testing-framework-to-port-and-optimize-simd-library-to-open-power-systems

I presented this work at the OpenPOWER Summit NA 2021.
