We create a wrapper class acting as a layout for device mesh so that we can add new methods more specific to DeviceMesh and keep the core logic of CuTe manipulation inside pycute module. This PR create the main body of the code and then next PR will come with actual implementation and unit test for device mesh layout. (Actual implementation can be found in https://github.com/pytorch/pytorch/pull/161016)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162414
Approved by: https://github.com/ezyang
ghstack dependencies: #162413, #162534