[dpdk-dev] [EXT] [PATCH] config: change default cache line size for ARMv8 with meson
Yongseok Koh
yskoh at mellanox.com
Mon Jan 14 05:32:32 CET 2019
> On Jan 9, 2019, at 6:57 AM, Thomas Monjalon <thomas at monjalon.net> wrote:
>
> 09/01/2019 15:23, Jerin Jacob Kollanukkaran:
>> On Wed, 2019-01-09 at 14:30 +0100, Thomas Monjalon wrote:
>>> 09/01/2019 13:47, Jerin Jacob Kollanukkaran:
>>>> On Wed, 2019-01-09 at 12:28 +0100, Thomas Monjalon wrote:
>>>>> 09/01/2019 11:49, Jerin Jacob Kollanukkaran:
>>>>>> On Wed, 2019-01-09 at 10:22 +0000, Yongseok Koh wrote:
>>>>>>> On Jan 9, 2019, at 2:09 AM, Jerin Jacob Kollanukkaran wrote:
>>>>>>>> I think, I way forward is to add
>>>>>>>> config/arm/arm64_a72_linuxapp_gcc
>>>>>>>> for meson. This config can be used for all SoC with A72
>>>>>>>> armv8
>>>>>>>> implementation and may have sym link to specfific SoC to
>>>>>>>> avoid
>>>>>>>> confusion to end users.
>>>>>>>
>>>>>>> Is config/arm/arm64_a72_linuxapp_gcc valid? Others have
>>>>>>
>>>>>> Yes. For cross compiling for A72.
>>>>>
>>>>> Any cross-compilation with meson requires a config file.
>>>>> The default Arm cross-compilation is done with
>>>>> config/arm/arm64_armv8_linuxapp_gcc
>>>>> which set implementor_id = 'generic'
>>>>>
>>>>> For native compilation, implementor_id is detected from
>>>>> /sys/devices/system/cpu/cpu0/regs/identification/midr_el1
>>>>>
>>>>> So each Arm machine needs 2 things:
>>>>> - a cross-compilation file
>>>>> - settings based on implementor_id in config/arm/meson.build
>>>>
>>>> Yes. config/arm/arm64_armv8_linuxapp_gcc sets the implementor_id =
>>>> 'generic' which assumed to generic across all the armv8 platform.
>>>> If tomorrow there is new core from ARM which A100 with armv8.2
>>>> specific
>>>> we can not tune the generic params armv8.2 as it will break other
>>>> CPU.
>>>>
>>>>
>>>>>> Having not seperate IMPLEMENTOR ID is a chip design issue.
>>>>>
>>>>> No I don't think it's a design issue.
>>>>> If the Arm core has no modification, it does not need to be
>>>>> specially identified.
>>>>
>>>> Thats right. It does not need to be specially identified,
>>>> then should have default config is enough.
>>>>
>>>>
>>>>>> I think it can work around by creating
>>>>>> config/arm/arm64_<your_soc_name>_linuxapp_gcc
>>>>>> and build on x86 or arm64 through
>>>>>>
>>>>>> meson build --cross-file
>>>>>> config/arm/arm64_<your_soc_name>_linuxapp_gcc
>>>>>
>>>>> No, it is a real A72, so it should work with default settings.
>>>>>
>>>>> The only issue we have is that the default cache line size for
>>>>> Aarch64
>>>>> is set to 128 in config/arm/meson.build, and this is wrong.
>>>>> The default cache line is 64 bits.
>>>>
>>>> The cache line size as per ARM spec it is IMPLEMENTATION DEFINED.
>>>
>>> In A72 spec, it is said
>>> "Returns 0b010 to indicate that the cache line size is 64 bytes."
>>> But I guess we cannot say it is always true for all models.
>>> So let's assume there is no default.
>>
>> Please note, A72 is not armv8 spec. A72 is just an IMPLEMENTATION of
>> armv8.
>
> Yes, this my understanding. That's why I agree with you.
>
>>>> So no default there. So the default is something work on all
>>>> platforms.
>>>> Actually Cavium has machine with 64B and 128B CL and same image
>>>> should
>>>> work on both for generic build.
>>>>
>>>>> This is already overriden for Cavium machines which have 128-bit
>>>>> cache lines.
>>>>> It may be needed to do the same change for other machines
>>>>> (Qualcomm?)
>>>>> having Arm core modified to 128-bit cache lines.
>>>>
>>>> Assume you meant 128B here.
>>>
>>> Yes, sorry I mixed bits and bytes :)
>>>
>>>> Building the image Naively(on 128B CL
>>>> machine) and cross compile (on x86) is not an issue.
>>>>
>>>>> The other concern is about running a generic Arm build.
>>>>
>>>> Yes. That's the ONLY concern.
>>>>
>>>>> Given 64-bit should be the default, generic builds will have this
>>>>> value.
>>>>> Is it a big issue for running generic 64-bit build on Cavium
>>>>> machines?
>>>>
>>>> Cavium has both 64B and 128B CL machines. So putting generic form,
>>>>
>>>> You can run 128B configured image on 64B machine, It will waste
>>>> some
>>>> memory not beyond that. Other way around will result in HW
>>>> misbehavior.
>>>> ie Running 64B CL image on 128B target.
>>>
>>> Indeed it is the main concern.
>>> Running DPDK tuned for 128 bytes on a core having 64 bytes cache line
>>> will result in lower performances. It is less an issue than HW
>>> misbehavior.
>>
>> Do you see performance issue or it more memory usage? It nothing
>> do with thread just of out curosity. Becase, our 64CL machine does
>> take more memory, performance seems to same for both. Note we are
>> using 512MB hugepage size.
>
> Yes, we see better performance with 64B cache line on Bluefield.
>
>>> If we agree to keep 128 bytes as generic cache line size for Arm,
>>> we need a way to get 64 bytes size for unmodified cores.
>>> In other words, the generic build settings must be different of
>>> the default settings.
>>
>> Please send a patch.
>>
>> If MIDR value is set to A72, we can set to 64B cache, no issue.
>>
>>> Please make a difference between default 'armv8' and 'generic'
>>> as implementor_id in config/arm/meson.build.
>>> I propose arm64_armv8_linuxapp_gcc being the default config (for
>>> armv8)
>>> and creating arm64_generic_linuxapp_gcc for the generic build (for
>>> distros).
>>
>> It should be inline with how distro guys build the image. I guess
>> we dont want DPDK to be a exception.
>
> The machine option is specific to DPDK, so we can define it as we want.
>
>> Please check below thread and patch.
>>
>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmails.dpdk.org%2Farchives%2Fdev%2F2019-January%2F122676.html&data=02%7C01%7Cyskoh%40mellanox.com%7C96576e66c4b6434b47ad08d67642be9a%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636826426371146146&sdata=MNMzpjs7e71l4vZAmoyqicpElp7UIFO48UuamggQWHQ%3D&reserved=0
>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatches.dpdk.org%2Fpatch%2F49477%2F&data=02%7C01%7Cyskoh%40mellanox.com%7C96576e66c4b6434b47ad08d67642be9a%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636826426371146146&sdata=okIrz7Idc8t7nMbFkcRnjxZg2wMn9ZTjqaTLlEXCnaU%3D&reserved=0
>>
>> Debian folks are building like this for the _generic_ image.
>> What ever works for every distros, I am fine with that.
>>
>> meson configure -Dmachine=default
>> meson build
>> cd build
>> ninja
>> ninja install
>
> I think we agree on the idea of having different configs
> for unmodified A72 core and generic build working for all.
> The remaining bits to discuss are:
> - do we want to use the armv8 config for unmodified A72?
> - what should be the name of the generic config?
>
> When digging more the config files in meson, I found this:
> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fmesonbuild.com%2FCross-compilation.html%23cross-file-locations&data=02%7C01%7Cyskoh%40mellanox.com%7C96576e66c4b6434b47ad08d67642be9a%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636826426371146146&sdata=J8XiCovgwqxm8HHRCJ5bSUbx4yTHCO2YuZz2ryZJx8I%3D&reserved=0
> It says that distros or compilers should provide some config files.
> It means we should check if some standard names are emerging
> and try to follow the same naming, or even re-use existing config files.
I'll come up with a new patch based on the discussion here.
A few things noted,
- we still want it to be 128B for generic build
- we at least agreed on changing it to 64B for A72
- As Qualcomm Centriq CPU has 128B cache line with A72, they should
create a profile in meson.build based on their impl_id.
I talked to Thomas and we'll shoot it for 19.05.
Thanks,
Yongseok
More information about the dev
mailing list