Topic
  • 7 replies
  • Latest Post - ‏2014-09-05T17:10:21Z by AdhemervalZanella2
ThinkOpenly
45 Posts

Pinned topic Application use of Power8 on-chip accelerators

‏2014-08-29T23:13:31Z |

The IBM Redbook entitled "Performance Optimization and Tuning Techniques for IBM Processors, including IBM POWER8" mentions:

On-chip accelerators, including on-chip encryption, compression, and random number
generation accelerators

Are there application-level APIs for exploiting these functions?

Updated on 2014-08-29T23:13:44Z by ThinkOpenly
  • Bill_Buros
    177 Posts

    Re: Application use of Power8 on-chip accelerators

    ‏2014-09-03T18:23:53Z  

    A good question. We'll poke around on that.

    For readers: the Redbook referenced is easily found with search engines, but here is the direct link:

    http://www.redbooks.ibm.com/abstracts/sg248171.html?Open

     

  • AdhemervalZanella2
    6 Posts

    Re: Application use of Power8 on-chip accelerators

    ‏2014-09-03T20:31:50Z  

    Hi

    "On-chip accelerators, including on-chip encryption, compression, and random number"

    For the ones provided by the new ISA 2.07 hardware instructions, the most straightforward way to use them is through compiler builtins. Newer GCC versions (4.8 and 4.9) have the new builtins [1], and you can use them as __builtin_<instruction>.
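
    As a rough illustration of the builtin route, here is a minimal sketch that applies the SHA-256 "small sigma 0" function to four words. It assumes a GCC that provides __builtin_crypto_vshasigmaw and defines the __CRYPTO__ macro when built with something like -mcpu=power8 -mcrypto; that macro check and the function names are my own choices for the example, not taken from the patch in [1]. On any other compiler or target it falls back to plain C, so the same file builds everywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable reference: SHA-256 "small sigma 0" on one 32-bit word,
 * ROTR(x,7) ^ ROTR(x,18) ^ SHR(x,3). */
uint32_t sigma0_scalar(uint32_t x)
{
    return ((x >> 7) | (x << 25)) ^ ((x >> 18) | (x << 14)) ^ (x >> 3);
}

/* Applies sigma0 to four words at once.
 * Assumption: __CRYPTO__ is defined when GCC targets POWER8 with the
 * crypto instructions enabled; otherwise the scalar loop is used. */
void sha256_sigma0_x4(const uint32_t in[4], uint32_t out[4])
{
    assert(in != NULL && out != NULL);
#if defined(__CRYPTO__)
    __vector unsigned int v;
    memcpy(&v, in, sizeof v);
    /* ST = 0 and SIX = 0 select the small sigma 0 function for every word. */
    v = __builtin_crypto_vshasigmaw(v, 0, 0);
    memcpy(out, &v, sizeof v);
#else
    for (size_t i = 0; i < 4; i++)
        out[i] = sigma0_scalar(in[i]);
#endif
}
```

    Keeping a scalar fallback next to the builtin also gives you something to compare results against when you first try this on POWER8 hardware.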

    For a higher-level API, there are some attempts to add use of these instructions to OpenSSL [2]; however, the latest version (1.0.0-i) still does not contain these changes. I don't know the exact status of this patch submission. I know we have plans to submit the same optimizations to other projects (GnuTLS).

    IBM J9 already supports it, and it is enabled by default where applicable.

     

    Now, there are also the POWER7+ off-chip accelerators, which are handled only through hypervisor calls. I know there is support for PowerVM already available, however I think support is still pending for PowerVM and PowerKVM. Also, since they are handled only by the kernel, their usage is limited (IPsec, disk encryption, and zswap, I think).

     

    These are the POWER7+ off-chip accelerators.

    [1] https://gcc.gnu.org/ml/gcc-patches/2013-05/msg01122.html

    [2] http://openssl.6102.n7.nabble.com/PATCH-0-4-Initial-POWER8-support-td47409.html

  • AdhemervalZanella2
    6 Posts

    Re: Application use of Power8 on-chip accelerators

    ‏2014-09-04T12:40:08Z  

    Hi,

    I just checked with the developer who pushed the initial optimizations for OpenSSL, and he told me the POWER8 support is already upstream, but there is no 'official' OpenSSL release with the optimizations yet.

    The plan is to backport these optimizations to upcoming distro releases. The initial plan is to support RHEL 7.1 and Ubuntu 14.10 (he is not sure about SLES 12). It will have support for both:

    * AES 128/192/256 (CTR, GCM, CBC)

    * SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256

    The CRC32 support is not yet implemented in any library.

  • ThinkOpenly
    45 Posts

    Re: Application use of Power8 on-chip accelerators

    ‏2014-09-04T18:02:53Z  

    Specifically regarding compression, is that available to applications in any way?

  • AdhemervalZanella2
    6 Posts

    Re: Application use of Power8 on-chip accelerators

    ‏2014-09-05T12:08:29Z  

    For compression acceleration, there is the IBM POWER7+ in-Nest accelerator, which uses a specialized algorithm named 842. As stated in the initial patchset to enable it on Linux [1], it has limits for generic compression, so it is focused on kernel areas where the hypervisor latency and the data size justify its usage. Currently it is used for in-kernel memory compression for VM consolidation; information on how to enable and evaluate it can be found at [2] and [3].

    Now, I don't think there is any effort or project to provide user-level access to this accelerator, since it uses a non-standard algorithm, has limitations on input sizes, needs a kernel bridge to access the chip facilities, and requires a hypervisor call (increasing latency).

     

    [1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2012-July/099513.html

    [2] https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/W51a7ffcf4dfd_4b40_9d82_446ebc23c550/page/Build%20F17%20with%20Memory%20Compression

    [3] https://www.ibm.com/developerworks/mydeveloperworks/blogs/fe313521-2e95-46f2-817d-44a4f27eba32/entry/new_linux_zswap_compression_functionality7?lang=en
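
    Since the accelerator is only reachable through the kernel, about all an application can do is check whether in-kernel memory compression is active. A minimal sketch, assuming the zswap module parameters are exposed under /sys/module/zswap/parameters/ (the enabled and compressor files there) on your kernel; the helper names are made up for the example, and whether the compressor reads "842" depends on how the system was configured.

```c
#include <stdio.h>
#include <string.h>

/* Reads the first line of a sysfs-style text file into buf, with the
 * trailing newline stripped. Returns 0 on success, -1 on any failure. */
int read_param(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (path == NULL || buf == NULL || len == 0)
        return -1;
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Prints the zswap state. The paths are an assumption about where the
 * running kernel exposes the zswap module parameters. */
int print_zswap_status(void)
{
    char enabled[16], compressor[32];

    if (read_param("/sys/module/zswap/parameters/enabled",
                   enabled, sizeof enabled) != 0 ||
        read_param("/sys/module/zswap/parameters/compressor",
                   compressor, sizeof compressor) != 0) {
        puts("zswap parameters not available on this system");
        return -1;
    }
    printf("zswap enabled=%s compressor=%s\n", enabled, compressor);
    return 0;
}
```

    Calling print_zswap_status() from a small main() is enough to see whether zswap is on and which compressor it was set up with.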

  • ThinkOpenly
    45 Posts

    Re: Application use of Power8 on-chip accelerators

    ‏2014-09-05T16:31:53Z  

    I notice you only mentioned Power7+. Power8 also has a compression accelerator, no?

    Also, given the ability to run non-virtualized (https://www.ibm.com/developerworks/community/blogs/fe313521-2e95-46f2-817d-44a4f27eba32/entry/ubuntu_14_04_on_powernv), does it become a bit more practical due to avoiding the hypervisor call?

  • AdhemervalZanella2
    6 Posts

    Re: Application use of Power8 on-chip accelerators

    ‏2014-09-05T17:10:21Z  

    POWER8 does have the same crypto, compression, and random number generation accelerators as POWER7+. However, in non-virtualized (NV) mode, offload accelerators are accessed through firmware, which in this case would be OPAL instead of PHYP. And OPAL only seems to provide access to the random number generator [1], if I'm reading its code correctly.

    I am not sure what the plan is to support crypto and/or compression acceleration in NV environments; however, even for such cases I would expect support to be similar to PowerVM: in-kernel utilization for tasks like memory compression (zswap) and network encryption (IPsec, etc.).

    [1] https://github.com/open-power/skiboot/blob/master/hw/nx.c
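
    For the random number generator, an application would normally not talk to the firmware itself but read from whatever character device the kernel exposes. A minimal sketch, assuming the kernel registers the hardware generator with the Linux hw_random framework so that it shows up as /dev/hwrng; that device path is an assumption about the running system, and read_rng_bytes is a name made up for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Fills buf with n bytes read from the character device dev.
 * Returns 0 on success, -1 if the device cannot be opened or
 * delivers fewer than n bytes. */
int read_rng_bytes(const char *dev, unsigned char *buf, size_t n)
{
    size_t got = 0;
    int fd;

    if (dev == NULL || buf == NULL)
        return -1;
    fd = open(dev, O_RDONLY);
    if (fd < 0)
        return -1;
    while (got < n) {
        ssize_t r = read(fd, buf + got, n - got);
        if (r > 0)
            got += (size_t)r;
        else if (r < 0 && errno == EINTR)
            continue;   /* interrupted by a signal, retry */
        else
            break;      /* end of file or a hard error */
    }
    close(fd);
    return got == n ? 0 : -1;
}
```

    A caller would pass "/dev/hwrng" where that device exists (reading it usually needs root) and fall back to "/dev/urandom" when the call returns -1.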