diff options
| author | Dave Airlie <airlied@redhat.com> | 2026-01-28 13:35:17 +1000 |
|---|---|---|
| committer | Dave Airlie <airlied@redhat.com> | 2026-01-28 13:35:23 +1000 |
| commit | 15392f76405ecb953216b437bed76ffa49cefb7b (patch) | |
| tree | 475b2e745389418a67ccc3066e8bf9354fc536c2 | |
| parent | 205bd15619322a1429c1bf53831a284a12b25e2a (diff) | |
| parent | cea7b66a80412e2a5b74627b89ae25f1d0110a4b (diff) | |
Merge tag 'drm-rust-next-2026-01-26' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-next
DRM Rust changes for v7.0-rc1
DRM:
- Fix documentation for Registration constructors.
- Use pin_init::zeroed() for fops initialization.
- Annotate DRM helpers with __rust_helper.
- Improve safety documentation for gem::Object::new().
- Update AlwaysRefCounted imports.
MM:
- Prevent integer overflow in page_align().
Nova (Core):
- Prepare for Turing support. This includes parsing and handling
Turing-specific firmware headers and sections as well as a Turing
Falcon HAL implementation.
- Get rid of the Result<impl PinInit<T, E>> anti-pattern.
- Relocate initializer-specific code into the appropriate initializer.
- Use CStr::from_bytes_until_nul() to remove custom helpers.
- Improve handling of unexpected firmware values.
- Clean up redundant debug prints.
- Replace c_str!() with native Rust C-string literals.
- Update nova-core task list.
Nova (DRM):
- Align GEM object size to system page size.
Tyr:
- Use generated uAPI bindings for GpuInfo.
- Replace manual sleeps with read_poll_timeout().
- Replace c_str!() with native Rust C-string literals.
- Suppress warnings for unread fields.
- Fix incorrect register name in print statement.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: "Danilo Krummrich" <dakr@kernel.org>
Link: https://patch.msgid.link/DFYW1WV6DUCG.3K8V2DAVD1Q4A@kernel.org
30 files changed, 822 insertions, 431 deletions
diff --git a/Documentation/gpu/nova/core/todo.rst b/Documentation/gpu/nova/core/todo.rst index 35cc7c31d423..d1964eb645e2 100644 --- a/Documentation/gpu/nova/core/todo.rst +++ b/Documentation/gpu/nova/core/todo.rst @@ -41,8 +41,15 @@ trait [1] from the num crate. Having this generalization also helps with implementing a generic macro that automatically generates the corresponding mappings between a value and a number. +FromPrimitive support has been worked on in the past, but hasn't been followed +since then [1]. + +There also have been considerations of ToPrimitive [2]. + | Complexity: Beginner | Link: https://docs.rs/num/latest/num/trait.FromPrimitive.html +| Link: https://lore.kernel.org/all/cover.1750689857.git.y.j3ms.n@gmail.com/ [1] +| Link: https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/Implement.20.60FromPrimitive.60.20trait.20.2B.20derive.20macro.20for.20nova-core/with/541971854 [2] Generic register abstraction [REGA] ----------------------------------- @@ -134,21 +141,6 @@ A `num` core kernel module is being designed to provide these operations. | Complexity: Intermediate | Contact: Alexandre Courbot -IRQ abstractions ----------------- - -Rust abstractions for IRQ handling. - -There is active ongoing work from Daniel Almeida [1] for the "core" abstractions -to request IRQs. - -Besides optional review and testing work, the required ``pci::Device`` code -around those core abstractions needs to be worked out. - -| Complexity: Intermediate -| Link: https://lore.kernel.org/lkml/20250122163932.46697-1-daniel.almeida@collabora.com/ [1] -| Contact: Daniel Almeida - Page abstraction for foreign pages ---------------------------------- @@ -161,40 +153,16 @@ There is active onging work from Abdiel Janulgue [1] and Lina [2]. | Link: https://lore.kernel.org/linux-mm/20241119112408.779243-1-abdiel.janulgue@gmail.com/ [1] | Link: https://lore.kernel.org/rust-for-linux/20250202-rust-page-v1-0-e3170d7fe55e@asahilina.net/ [2] -Scatterlist / sg_table abstractions ------------------------------------ - -Rust abstractions for scatterlist / sg_table. - -There is preceding work from Abdiel Janulgue, which hasn't made it to the -mailing list yet. - -| Complexity: Intermediate -| Contact: Abdiel Janulgue - PCI MISC APIs ------------- -Extend the existing PCI device / driver abstractions by SR-IOV, config space, -capability, MSI API abstractions. - -| Complexity: Beginner +Extend the existing PCI device / driver abstractions by SR-IOV, capability, MSI +API abstractions. -XArray bindings [XARR] ----------------------- - -We need bindings for `xa_alloc`/`xa_alloc_cyclic` in order to generate the -auxiliary device IDs. - -| Complexity: Intermediate +SR-IOV [1] is work in progress. -Debugfs abstractions --------------------- - -Rust abstraction for debugfs APIs. - -| Reference: Export GSP log buffers -| Complexity: Intermediate +| Complexity: Beginner +| Link: https://lore.kernel.org/all/20251119-rust-pci-sriov-v1-0-883a94599a97@redhat.com/ [1] GPU (general) ============= @@ -233,7 +201,10 @@ Some possible options: - maple_tree - native Rust collections +There is work in progress for using drm_buddy [1]. + | Complexity: Advanced +| Link: https://lore.kernel.org/all/20251219203805.1246586-4-joelagnelf@nvidia.com/ [1] Instance Memory --------------- diff --git a/drivers/gpu/drm/nova/driver.rs b/drivers/gpu/drm/nova/driver.rs index 2246d8e104e0..b1af0a099551 100644 --- a/drivers/gpu/drm/nova/driver.rs +++ b/drivers/gpu/drm/nova/driver.rs @@ -1,7 +1,15 @@ // SPDX-License-Identifier: GPL-2.0 use kernel::{ - auxiliary, c_str, device::Core, drm, drm::gem, drm::ioctl, prelude::*, sync::aref::ARef, + auxiliary, + device::Core, + drm::{ + self, + gem, + ioctl, // + }, + prelude::*, + sync::aref::ARef, // }; use crate::file::File; @@ -24,12 +32,12 @@ const INFO: drm::DriverInfo = drm::DriverInfo { major: 0, minor: 0, patchlevel: 0, - name: c_str!("nova"), - desc: c_str!("Nvidia Graphics"), + name: c"nova", + desc: c"Nvidia Graphics", }; -const NOVA_CORE_MODULE_NAME: &CStr = c_str!("NovaCore"); -const AUXILIARY_NAME: &CStr = c_str!("nova-drm"); +const NOVA_CORE_MODULE_NAME: &CStr = c"NovaCore"; +const AUXILIARY_NAME: &CStr = c"nova-drm"; kernel::auxiliary_device_table!( AUX_TABLE, diff --git a/drivers/gpu/drm/nova/gem.rs b/drivers/gpu/drm/nova/gem.rs index 2760ba4f3450..6ccfa5da5761 100644 --- a/drivers/gpu/drm/nova/gem.rs +++ b/drivers/gpu/drm/nova/gem.rs @@ -3,6 +3,7 @@ use kernel::{ drm, drm::{gem, gem::BaseObject}, + page, prelude::*, sync::aref::ARef, }; @@ -27,11 +28,10 @@ impl gem::DriverObject for NovaObject { impl NovaObject { /// Create a new DRM GEM object. pub(crate) fn new(dev: &NovaDevice, size: usize) -> Result<ARef<gem::Object<Self>>> { - let aligned_size = size.next_multiple_of(1 << 12); - - if size == 0 || size > aligned_size { + if size == 0 { return Err(EINVAL); } + let aligned_size = page::page_align(size).ok_or(EINVAL)?; gem::Object::new(dev, aligned_size) } diff --git a/drivers/gpu/drm/tyr/driver.rs b/drivers/gpu/drm/tyr/driver.rs index 0389c558c036..568cb89aaed8 100644 --- a/drivers/gpu/drm/tyr/driver.rs +++ b/drivers/gpu/drm/tyr/driver.rs @@ -1,6 +1,5 @@ // SPDX-License-Identifier: GPL-2.0 or MIT -use kernel::c_str; use kernel::clk::Clk; use kernel::clk::OptionalClk; use kernel::device::Bound; @@ -9,6 +8,7 @@ use kernel::device::Device; use kernel::devres::Devres; use kernel::drm; use kernel::drm::ioctl; +use kernel::io::poll; use kernel::new_mutex; use kernel::of; use kernel::platform; @@ -16,10 +16,10 @@ use kernel::prelude::*; use kernel::regulator; use kernel::regulator::Regulator; use kernel::sizes::SZ_2M; +use kernel::sync::aref::ARef; use kernel::sync::Arc; use kernel::sync::Mutex; use kernel::time; -use kernel::types::ARef; use crate::file::File; use crate::gem::TyrObject; @@ -34,7 +34,7 @@ pub(crate) type TyrDevice = drm::Device<TyrDriver>; #[pin_data(PinnedDrop)] pub(crate) struct TyrDriver { - device: ARef<TyrDevice>, + _device: ARef<TyrDevice>, } #[pin_data(PinnedDrop)] @@ -68,20 +68,13 @@ unsafe impl Sync for TyrData {} fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result { regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?; - // TODO: We cannot poll, as there is no support in Rust currently, so we - // sleep. Change this when read_poll_timeout() is implemented in Rust. - kernel::time::delay::fsleep(time::Delta::from_millis(100)); - - if regs::GPU_IRQ_RAWSTAT.read(dev, iomem)? & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED == 0 { - dev_err!(dev, "GPU reset failed with errno\n"); - dev_err!( - dev, - "GPU_INT_RAWSTAT is {}\n", - regs::GPU_IRQ_RAWSTAT.read(dev, iomem)? - ); - - return Err(EIO); - } + poll::read_poll_timeout( + || regs::GPU_IRQ_RAWSTAT.read(dev, iomem), + |status| *status & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED != 0, + time::Delta::from_millis(1), + time::Delta::from_millis(100), + ) + .inspect_err(|_| dev_err!(dev, "GPU reset failed."))?; Ok(()) } @@ -91,8 +84,8 @@ kernel::of_device_table!( MODULE_OF_TABLE, <TyrDriver as platform::Driver>::IdInfo, [ - (of::DeviceId::new(c_str!("rockchip,rk3588-mali")), ()), - (of::DeviceId::new(c_str!("arm,mali-valhall-csf")), ()) + (of::DeviceId::new(c"rockchip,rk3588-mali"), ()), + (of::DeviceId::new(c"arm,mali-valhall-csf"), ()) ] ); @@ -104,16 +97,16 @@ impl platform::Driver for TyrDriver { pdev: &platform::Device<Core>, _info: Option<&Self::IdInfo>, ) -> impl PinInit<Self, Error> { - let core_clk = Clk::get(pdev.as_ref(), Some(c_str!("core")))?; - let stacks_clk = OptionalClk::get(pdev.as_ref(), Some(c_str!("stacks")))?; - let coregroup_clk = OptionalClk::get(pdev.as_ref(), Some(c_str!("coregroup")))?; + let core_clk = Clk::get(pdev.as_ref(), Some(c"core"))?; + let stacks_clk = OptionalClk::get(pdev.as_ref(), Some(c"stacks"))?; + let coregroup_clk = OptionalClk::get(pdev.as_ref(), Some(c"coregroup"))?; core_clk.prepare_enable()?; stacks_clk.prepare_enable()?; coregroup_clk.prepare_enable()?; - let mali_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c_str!("mali"))?; - let sram_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c_str!("sram"))?; + let mali_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c"mali")?; + let sram_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c"sram")?; let request = pdev.io_request_by_index(0).ok_or(ENODEV)?; let iomem = Arc::pin_init(request.iomap_sized::<SZ_2M>(), GFP_KERNEL)?; @@ -134,8 +127,8 @@ impl platform::Driver for TyrDriver { coregroup: coregroup_clk, }), regulators <- new_mutex!(Regulators { - mali: mali_regulator, - sram: sram_regulator, + _mali: mali_regulator, + _sram: sram_regulator, }), gpu_info, }); @@ -143,7 +136,7 @@ impl platform::Driver for TyrDriver { let tdev: ARef<TyrDevice> = drm::Device::new(pdev.as_ref(), data)?; drm::driver::Registration::new_foreign_owned(&tdev, pdev.as_ref(), 0)?; - let driver = TyrDriver { device: tdev }; + let driver = TyrDriver { _device: tdev }; // We need this to be dev_info!() because dev_dbg!() does not work at // all in Rust for now, and we need to see whether probe succeeded. @@ -174,8 +167,8 @@ const INFO: drm::DriverInfo = drm::DriverInfo { major: 1, minor: 5, patchlevel: 0, - name: c_str!("panthor"), - desc: c_str!("ARM Mali Tyr DRM driver"), + name: c"panthor", + desc: c"ARM Mali Tyr DRM driver", }; #[vtable] @@ -200,6 +193,6 @@ struct Clocks { #[pin_data] struct Regulators { - mali: Regulator<regulator::Enabled>, - sram: Regulator<regulator::Enabled>, + _mali: Regulator<regulator::Enabled>, + _sram: Regulator<regulator::Enabled>, } diff --git a/drivers/gpu/drm/tyr/gpu.rs b/drivers/gpu/drm/tyr/gpu.rs index fb7ef7145402..6395ffcfdc57 100644 --- a/drivers/gpu/drm/tyr/gpu.rs +++ b/drivers/gpu/drm/tyr/gpu.rs @@ -1,12 +1,15 @@ // SPDX-License-Identifier: GPL-2.0 or MIT +use core::ops::Deref; +use core::ops::DerefMut; use kernel::bits::genmask_u32; use kernel::device::Bound; use kernel::device::Device; use kernel::devres::Devres; +use kernel::io::poll; use kernel::platform; use kernel::prelude::*; -use kernel::time; +use kernel::time::Delta; use kernel::transmute::AsBytes; use kernel::uapi; @@ -19,29 +22,9 @@ use crate::regs; /// # Invariants /// /// - The layout of this struct identical to the C `struct drm_panthor_gpu_info`. -#[repr(C)] -pub(crate) struct GpuInfo { - pub(crate) gpu_id: u32, - pub(crate) gpu_rev: u32, - pub(crate) csf_id: u32, - pub(crate) l2_features: u32, - pub(crate) tiler_features: u32, - pub(crate) mem_features: u32, - pub(crate) mmu_features: u32, - pub(crate) thread_features: u32, - pub(crate) max_threads: u32, - pub(crate) thread_max_workgroup_size: u32, - pub(crate) thread_max_barrier_size: u32, - pub(crate) coherency_features: u32, - pub(crate) texture_features: [u32; 4], - pub(crate) as_present: u32, - pub(crate) selected_coherency: u32, - pub(crate) shader_present: u64, - pub(crate) l2_present: u64, - pub(crate) tiler_present: u64, - pub(crate) core_features: u32, - pub(crate) pad: u32, -} +#[repr(transparent)] +#[derive(Clone, Copy)] +pub(crate) struct GpuInfo(pub(crate) uapi::drm_panthor_gpu_info); impl GpuInfo { pub(crate) fn new(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result<Self> { @@ -74,7 +57,7 @@ impl GpuInfo { let l2_present = u64::from(regs::GPU_L2_PRESENT_LO.read(dev, iomem)?); let l2_present = l2_present | u64::from(regs::GPU_L2_PRESENT_HI.read(dev, iomem)?) << 32; - Ok(Self { + Ok(Self(uapi::drm_panthor_gpu_info { gpu_id, gpu_rev, csf_id, @@ -96,7 +79,8 @@ impl GpuInfo { tiler_present, core_features, pad: 0, - }) + gpu_features: 0, + })) } pub(crate) fn log(&self, pdev: &platform::Device) { @@ -155,6 +139,20 @@ impl GpuInfo { } } +impl Deref for GpuInfo { + type Target = uapi::drm_panthor_gpu_info; + + fn deref(&self) -> &Self::Target { + &self.0 + } +} + +impl DerefMut for GpuInfo { + fn deref_mut(&mut self) -> &mut Self::Target { + &mut self.0 + } +} + // SAFETY: `GpuInfo`'s invariant guarantees that it is the same type that is // already exposed to userspace by the C driver. This implies that it fulfills // the requirements for `AsBytes`. @@ -207,14 +205,13 @@ impl From<u32> for GpuId { pub(crate) fn l2_power_on(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result { regs::L2_PWRON_LO.write(dev, iomem, 1)?; - // TODO: We cannot poll, as there is no support in Rust currently, so we - // sleep. Change this when read_poll_timeout() is implemented in Rust. - kernel::time::delay::fsleep(time::Delta::from_millis(100)); - - if regs::L2_READY_LO.read(dev, iomem)? != 1 { - dev_err!(dev, "Failed to power on the GPU\n"); - return Err(EIO); - } + poll::read_poll_timeout( + || regs::L2_READY_LO.read(dev, iomem), + |status| *status == 1, + Delta::from_millis(1), + Delta::from_millis(100), + ) + .inspect_err(|_| dev_err!(dev, "Failed to power on the GPU."))?; Ok(()) } diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs index b8b0cc0f2d93..5a4cc047bcfc 100644 --- a/drivers/gpu/nova-core/driver.rs +++ b/drivers/gpu/nova-core/driver.rs @@ -2,7 +2,6 @@ use kernel::{ auxiliary, - c_str, device::Core, devres::Devres, dma::Device, @@ -82,7 +81,7 @@ impl pci::Driver for NovaCore { unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? }; let bar = Arc::pin_init( - pdev.iomap_region_sized::<BAR0_SIZE>(0, c_str!("nova-core/bar0")), + pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"), GFP_KERNEL, )?; @@ -90,7 +89,7 @@ impl pci::Driver for NovaCore { gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?), _reg <- auxiliary::Registration::new( pdev.as_ref(), - c_str!("nova-drm"), + c"nova-drm", 0, // TODO[XARR]: Once it lands, use XArray; for now we don't use the ID. crate::MODULE_NAME ), diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs index 82c661aef594..37bfee1d0949 100644 --- a/drivers/gpu/nova-core/falcon.rs +++ b/drivers/gpu/nova-core/falcon.rs @@ -8,12 +8,14 @@ use hal::FalconHal; use kernel::{ device, - dma::DmaAddress, + dma::{ + DmaAddress, + DmaMask, // + }, io::poll::read_poll_timeout, prelude::*, sync::aref::ARef, time::{ - delay::fsleep, Delta, // }, }; @@ -21,6 +23,7 @@ use kernel::{ use crate::{ dma::DmaObject, driver::Bar0, + falcon::hal::LoadMethod, gpu::Chipset, num::{ FromSafeCast, @@ -237,8 +240,11 @@ impl From<PeregrineCoreSelect> for bool { /// Different types of memory present in a falcon core. #[derive(Debug, Clone, Copy, PartialEq, Eq)] pub(crate) enum FalconMem { - /// Instruction Memory. - Imem, + /// Secure Instruction Memory. + ImemSecure, + /// Non-Secure Instruction Memory. + #[expect(unused)] + ImemNonSecure, /// Data Memory. Dmem, } @@ -345,8 +351,12 @@ pub(crate) struct FalconBromParams { /// Trait for providing load parameters of falcon firmwares. pub(crate) trait FalconLoadParams { - /// Returns the load parameters for `IMEM`. - fn imem_load_params(&self) -> FalconLoadTarget; + /// Returns the load parameters for Secure `IMEM`. + fn imem_sec_load_params(&self) -> FalconLoadTarget; + + /// Returns the load parameters for Non-Secure `IMEM`, + /// used only on Turing and GA100. + fn imem_ns_load_params(&self) -> Option<FalconLoadTarget>; /// Returns the load parameters for `DMEM`. fn dmem_load_params(&self) -> FalconLoadTarget; @@ -388,48 +398,11 @@ impl<E: FalconEngine + 'static> Falcon<E> { regs::NV_PFALCON_FALCON_DMACTL::default().write(bar, &E::ID); } - /// Wait for memory scrubbing to complete. - fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result { - // TIMEOUT: memory scrubbing should complete in less than 20ms. - read_poll_timeout( - || Ok(regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID)), - |r| r.mem_scrubbing_done(), - Delta::ZERO, - Delta::from_millis(20), - ) - .map(|_| ()) - } - - /// Reset the falcon engine. - fn reset_eng(&self, bar: &Bar0) -> Result { - let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID); - - // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set - // RESET_READY so a non-failing timeout is used. - let _ = read_poll_timeout( - || Ok(regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID)), - |r| r.reset_ready(), - Delta::ZERO, - Delta::from_micros(150), - ); - - regs::NV_PFALCON_FALCON_ENGINE::update(bar, &E::ID, |v| v.set_reset(true)); - - // TIMEOUT: falcon engine should not take more than 10us to reset. - fsleep(Delta::from_micros(10)); - - regs::NV_PFALCON_FALCON_ENGINE::update(bar, &E::ID, |v| v.set_reset(false)); - - self.reset_wait_mem_scrubbing(bar)?; - - Ok(()) - } - /// Reset the controller, select the falcon core, and wait for memory scrubbing to complete. pub(crate) fn reset(&self, bar: &Bar0) -> Result { - self.reset_eng(bar)?; + self.hal.reset_eng(bar)?; self.hal.select_core(self, bar)?; - self.reset_wait_mem_scrubbing(bar)?; + self.hal.reset_wait_mem_scrubbing(bar)?; regs::NV_PFALCON_FALCON_RM::default() .set_value(regs::NV_PMC_BOOT_0::read(bar).into()) @@ -448,7 +421,6 @@ impl<E: FalconEngine + 'static> Falcon<E> { fw: &F, target_mem: FalconMem, load_offsets: FalconLoadTarget, - sec: bool, ) -> Result { const DMA_LEN: u32 = 256; @@ -457,7 +429,9 @@ impl<E: FalconEngine + 'static> Falcon<E> { // // For DMEM we can fold the start offset into the DMA handle. let (src_start, dma_start) = match target_mem { - FalconMem::Imem => (load_offsets.src_start, fw.dma_handle()), + FalconMem::ImemSecure | FalconMem::ImemNonSecure => { + (load_offsets.src_start, fw.dma_handle()) + } FalconMem::Dmem => ( 0, fw.dma_handle_with_offset(load_offsets.src_start.into_safe_cast())?, @@ -466,12 +440,18 @@ impl<E: FalconEngine + 'static> Falcon<E> { if dma_start % DmaAddress::from(DMA_LEN) > 0 { dev_err!( self.dev, - "DMA transfer start addresses must be a multiple of {}", + "DMA transfer start addresses must be a multiple of {}\n", DMA_LEN ); return Err(EINVAL); } + // The DMATRFBASE/1 register pair only supports a 49-bit address. + if dma_start > DmaMask::new::<49>().value() { + dev_err!(self.dev, "DMA address {:#x} exceeds 49 bits\n", dma_start); + return Err(ERANGE); + } + // DMA transfers can only be done in units of 256 bytes. Compute how many such transfers we // need to perform. let num_transfers = load_offsets.len.div_ceil(DMA_LEN); @@ -483,11 +463,11 @@ impl<E: FalconEngine + 'static> Falcon<E> { .and_then(|size| size.checked_add(load_offsets.src_start)) { None => { - dev_err!(self.dev, "DMA transfer length overflow"); + dev_err!(self.dev, "DMA transfer length overflow\n"); return Err(EOVERFLOW); } Some(upper_bound) if usize::from_safe_cast(upper_bound) > fw.size() => { - dev_err!(self.dev, "DMA transfer goes beyond range of DMA object"); + dev_err!(self.dev, "DMA transfer goes beyond range of DMA object\n"); return Err(EINVAL); } Some(_) => (), @@ -508,8 +488,7 @@ impl<E: FalconEngine + 'static> Falcon<E> { let cmd = regs::NV_PFALCON_FALCON_DMATRFCMD::default() .set_size(DmaTrfCmdSize::Size256B) - .set_imem(target_mem == FalconMem::Imem) - .set_sec(if sec { 1 } else { 0 }); + .with_falcon_mem(target_mem); for pos in (0..num_transfers).map(|i| i * DMA_LEN) { // Perform a transfer of size `DMA_LEN`. @@ -536,15 +515,22 @@ impl<E: FalconEngine + 'static> Falcon<E> { } /// Perform a DMA load into `IMEM` and `DMEM` of `fw`, and prepare the falcon to run it. - pub(crate) fn dma_load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result { + fn dma_load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result { + // The Non-Secure section only exists on firmware used by Turing and GA100, and + // those platforms do not use DMA. + if fw.imem_ns_load_params().is_some() { + debug_assert!(false); + return Err(EINVAL); + } + self.dma_reset(bar); regs::NV_PFALCON_FBIF_TRANSCFG::update(bar, &E::ID, 0, |v| { v.set_target(FalconFbifTarget::CoherentSysmem) .set_mem_type(FalconFbifMemType::Physical) }); - self.dma_wr(bar, fw, FalconMem::Imem, fw.imem_load_params(), true)?; - self.dma_wr(bar, fw, FalconMem::Dmem, fw.dmem_load_params(), true)?; + self.dma_wr(bar, fw, FalconMem::ImemSecure, fw.imem_sec_load_params())?; + self.dma_wr(bar, fw, FalconMem::Dmem, fw.dmem_load_params())?; self.hal.program_brom(self, bar, &fw.brom_params())?; @@ -651,8 +637,15 @@ impl<E: FalconEngine + 'static> Falcon<E> { /// /// Returns `true` if the RISC-V core is active, `false` otherwise. pub(crate) fn is_riscv_active(&self, bar: &Bar0) -> bool { - let cpuctl = regs::NV_PRISCV_RISCV_CPUCTL::read(bar, &E::ID); - cpuctl.active_stat() + self.hal.is_riscv_active(bar) + } + + // Load a firmware image into Falcon memory + pub(crate) fn load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result { + match self.hal.load_method() { + LoadMethod::Dma => self.dma_load(bar, fw), + LoadMethod::Pio => Err(ENOTSUPP), + } } /// Write the application version to the OS register. diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs index 8dc56a28ad65..89babd5f9325 100644 --- a/drivers/gpu/nova-core/falcon/hal.rs +++ b/drivers/gpu/nova-core/falcon/hal.rs @@ -13,6 +13,16 @@ use crate::{ }; mod ga102; +mod tu102; + +/// Method used to load data into falcon memory. Some GPU architectures need +/// PIO and others can use DMA. +pub(crate) enum LoadMethod { + /// Programmed I/O + Pio, + /// Direct Memory Access + Dma, +} /// Hardware Abstraction Layer for Falcon cores. /// @@ -37,6 +47,19 @@ pub(crate) trait FalconHal<E: FalconEngine>: Send + Sync { /// Program the boot ROM registers prior to starting a secure firmware. fn program_brom(&self, falcon: &Falcon<E>, bar: &Bar0, params: &FalconBromParams) -> Result; + + /// Check if the RISC-V core is active. + /// Returns `true` if the RISC-V core is active, `false` otherwise. + fn is_riscv_active(&self, bar: &Bar0) -> bool; + + /// Wait for memory scrubbing to complete. + fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result; + + /// Reset the falcon engine. + fn reset_eng(&self, bar: &Bar0) -> Result; + + /// returns the method needed to load data into Falcon memory + fn load_method(&self) -> LoadMethod; } /// Returns a boxed falcon HAL adequate for `chipset`. @@ -50,6 +73,9 @@ pub(super) fn falcon_hal<E: FalconEngine + 'static>( use Chipset::*; let hal = match chipset { + TU102 | TU104 | TU106 | TU116 | TU117 => { + KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>> + } GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => { KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>> } diff --git a/drivers/gpu/nova-core/falcon/hal/ga102.rs b/drivers/gpu/nova-core/falcon/hal/ga102.rs index 69a7a95cac16..8f62df10da0a 100644 --- a/drivers/gpu/nova-core/falcon/hal/ga102.rs +++ b/drivers/gpu/nova-core/falcon/hal/ga102.rs @@ -12,6 +12,7 @@ use kernel::{ use crate::{ driver::Bar0, falcon::{ + hal::LoadMethod, Falcon, FalconBromParams, FalconEngine, @@ -52,7 +53,7 @@ fn signature_reg_fuse_version_ga102( let ucode_idx = match usize::from(ucode_id) { ucode_id @ 1..=regs::NV_FUSE_OPT_FPF_SIZE => ucode_id - 1, _ => { - dev_err!(dev, "invalid ucode id {:#x}", ucode_id); + dev_err!(dev, "invalid ucode id {:#x}\n", ucode_id); return Err(EINVAL); } }; @@ -66,7 +67,7 @@ fn signature_reg_fuse_version_ga102( } else if engine_id_mask & 0x0400 != 0 { regs::NV_FUSE_OPT_FPF_GSP_UCODE1_VERSION::read(bar, ucode_idx).data() } else { - dev_err!(dev, "unexpected engine_id_mask {:#x}", engine_id_mask); + dev_err!(dev, "unexpected engine_id_mask {:#x}\n", engine_id_mask); return Err(EINVAL); }; @@ -117,4 +118,42 @@ impl<E: FalconEngine> FalconHal<E> for Ga102<E> { fn program_brom(&self, _falcon: &Falcon<E>, bar: &Bar0, params: &FalconBromParams) -> Result { program_brom_ga102::<E>(bar, params) } + + fn is_riscv_active(&self, bar: &Bar0) -> bool { + let cpuctl = regs::NV_PRISCV_RISCV_CPUCTL::read(bar, &E::ID); + cpuctl.active_stat() + } + + fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result { + // TIMEOUT: memory scrubbing should complete in less than 20ms. + read_poll_timeout( + || Ok(regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID)), + |r| r.mem_scrubbing_done(), + Delta::ZERO, + Delta::from_millis(20), + ) + .map(|_| ()) + } + + fn reset_eng(&self, bar: &Bar0) -> Result { + let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID); + + // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set + // RESET_READY so a non-failing timeout is used. + let _ = read_poll_timeout( + || Ok(regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID)), + |r| r.reset_ready(), + Delta::ZERO, + Delta::from_micros(150), + ); + + regs::NV_PFALCON_FALCON_ENGINE::reset_engine::<E>(bar); + self.reset_wait_mem_scrubbing(bar)?; + + Ok(()) + } + + fn load_method(&self) -> LoadMethod { + LoadMethod::Dma + } } diff --git a/drivers/gpu/nova-core/falcon/hal/tu102.rs b/drivers/gpu/nova-core/falcon/hal/tu102.rs new file mode 100644 index 000000000000..7de6f24cc0a0 --- /dev/null +++ b/drivers/gpu/nova-core/falcon/hal/tu102.rs @@ -0,0 +1,77 @@ +// SPDX-License-Identifier: GPL-2.0 + +use core::marker::PhantomData; + +use kernel::{ + io::poll::read_poll_timeout, + prelude::*, + time::Delta, // +}; + +use crate::{ + driver::Bar0, + falcon::{ + hal::LoadMethod, + Falcon, + FalconBromParams, + FalconEngine, // + }, + regs, // +}; + +use super::FalconHal; + +pub(super) struct Tu102<E: FalconEngine>(PhantomData<E>); + +impl<E: FalconEngine> Tu102<E> { + pub(super) fn new() -> Self { + Self(PhantomData) + } +} + +impl<E: FalconEngine> FalconHal<E> for Tu102<E> { + fn select_core(&self, _falcon: &Falcon<E>, _bar: &Bar0) -> Result { + Ok(()) + } + + fn signature_reg_fuse_version( + &self, + _falcon: &Falcon<E>, + _bar: &Bar0, + _engine_id_mask: u16, + _ucode_id: u8, + ) -> Result<u32> { + Ok(0) + } + + fn program_brom(&self, _falcon: &Falcon<E>, _bar: &Bar0, _params: &FalconBromParams) -> Result { + Ok(()) + } + + fn is_riscv_active(&self, bar: &Bar0) -> bool { + let cpuctl = regs::NV_PRISCV_RISCV_CORE_SWITCH_RISCV_STATUS::read(bar, &E::ID); + cpuctl.active_stat() + } + + fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result { + // TIMEOUT: memory scrubbing should complete in less than 10ms. + read_poll_timeout( + || Ok(regs::NV_PFALCON_FALCON_DMACTL::read(bar, &E::ID)), + |r| r.mem_scrubbing_done(), + Delta::ZERO, + Delta::from_millis(10), + ) + .map(|_| ()) + } + + fn reset_eng(&self, bar: &Bar0) -> Result { + regs::NV_PFALCON_FALCON_ENGINE::reset_engine::<E>(bar); + self.reset_wait_mem_scrubbing(bar)?; + + Ok(()) + } + + fn load_method(&self) -> LoadMethod { + LoadMethod::Pio + } +} diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs index 3c9cf151786c..c62abcaed547 100644 --- a/drivers/gpu/nova-core/fb.rs +++ b/drivers/gpu/nova-core/fb.rs @@ -80,7 +80,7 @@ impl SysmemFlush { let _ = hal.write_sysmem_flush_page(bar, 0).inspect_err(|e| { dev_warn!( &self.device, - "failed to unregister sysmem flush page: {:?}", + "failed to unregister sysmem flush page: {:?}\n", e ) }); diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs index 2d2008b33fb4..68779540aa28 100644 --- a/drivers/gpu/nova-core/firmware.rs +++ b/drivers/gpu/nova-core/firmware.rs @@ -4,6 +4,7 @@ //! to be loaded into a given execution unit. use core::marker::PhantomData; +use core::ops::Deref; use kernel::{ device, @@ -15,7 +16,10 @@ use kernel::{ use crate::{ dma::DmaObject, - falcon::FalconFirmware, + falcon::{ + FalconFirmware, + FalconLoadTarget, // + }, gpu, num::{ FromSafeCast, @@ -46,6 +50,46 @@ fn request_firmware( /// Structure used to describe some firmwares, notably FWSEC-FRTS. #[repr(C)] #[derive(Debug, Clone)] +pub(crate) struct FalconUCodeDescV2 { + /// Header defined by 'NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*' in OpenRM. + hdr: u32, + /// Stored size of the ucode after the header, compressed or uncompressed + stored_size: u32, + /// Uncompressed size of the ucode. If store_size == uncompressed_size, then the ucode + /// is not compressed. + pub(crate) uncompressed_size: u32, + /// Code entry point + pub(crate) virtual_entry: u32, + /// Offset after the code segment at which the Application Interface Table headers are located. + pub(crate) interface_offset: u32, + /// Base address at which to load the code segment into 'IMEM'. + pub(crate) imem_phys_base: u32, + /// Size in bytes of the code to copy into 'IMEM'. + pub(crate) imem_load_size: u32, + /// Virtual 'IMEM' address (i.e. 'tag') at which the code should start. + pub(crate) imem_virt_base: u32, + /// Virtual address of secure IMEM segment. + pub(crate) imem_sec_base: u32, + /// Size of secure IMEM segment. + pub(crate) imem_sec_size: u32, + /// Offset into stored (uncompressed) image at which DMEM begins. + pub(crate) dmem_offset: u32, + /// Base address at which to load the data segment into 'DMEM'. + pub(crate) dmem_phys_base: u32, + /// Size in bytes of the data to copy into 'DMEM'. + pub(crate) dmem_load_size: u32, + /// "Alternate" Size of data to load into IMEM. + pub(crate) alt_imem_load_size: u32, + /// "Alternate" Size of data to load into DMEM. + pub(crate) alt_dmem_load_size: u32, +} + +// SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability. +unsafe impl FromBytes for FalconUCodeDescV2 {} + +/// Structure used to describe some firmwares, notably FWSEC-FRTS. +#[repr(C)] +#[derive(Debug, Clone)] pub(crate) struct FalconUCodeDescV3 { /// Header defined by `NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*` in OpenRM. hdr: u32, @@ -76,13 +120,164 @@ pub(crate) struct FalconUCodeDescV3 { _reserved: u16, } -impl FalconUCodeDescV3 { +// SAFETY: all bit patterns are valid for this type, and it doesn't use +// interior mutability. +unsafe impl FromBytes for FalconUCodeDescV3 {} + +/// Enum wrapping the different versions of Falcon microcode descriptors. +/// +/// This allows handling both V2 and V3 descriptor formats through a +/// unified type, providing version-agnostic access to firmware metadata +/// via the [`FalconUCodeDescriptor`] trait. +#[derive(Debug, Clone)] +pub(crate) enum FalconUCodeDesc { + V2(FalconUCodeDescV2), + V3(FalconUCodeDescV3), +} + +impl Deref for FalconUCodeDesc { + type Target = dyn FalconUCodeDescriptor; + + fn deref(&self) -> &Self::Target { + match self { + FalconUCodeDesc::V2(v2) => v2, + FalconUCodeDesc::V3(v3) => v3, + } + } +} + +/// Trait providing a common interface for accessing Falcon microcode descriptor fields. +/// +/// This trait abstracts over the different descriptor versions ([`FalconUCodeDescV2`] and +/// [`FalconUCodeDescV3`]), allowing code to work with firmware metadata without needing to +/// know the specific descriptor version. Fields not present return zero. +pub(crate) trait FalconUCodeDescriptor { + fn hdr(&self) -> u32; + fn imem_load_size(&self) -> u32; + fn interface_offset(&self) -> u32; + fn dmem_load_size(&self) -> u32; + fn pkc_data_offset(&self) -> u32; + fn engine_id_mask(&self) -> u16; + fn ucode_id(&self) -> u8; + fn signature_count(&self) -> u8; + fn signature_versions(&self) -> u16; + /// Returns the size in bytes of the header. - pub(crate) fn size(&self) -> usize { + fn size(&self) -> usize { + let hdr = self.hdr(); + const HDR_SIZE_SHIFT: u32 = 16; const HDR_SIZE_MASK: u32 = 0xffff0000; + ((hdr & HDR_SIZE_MASK) >> HDR_SIZE_SHIFT).into_safe_cast() + } - ((self.hdr & HDR_SIZE_MASK) >> HDR_SIZE_SHIFT).into_safe_cast() + fn imem_sec_load_params(&self) -> FalconLoadTarget; + fn imem_ns_load_params(&self) -> Option<FalconLoadTarget>; + fn dmem_load_params(&self) -> FalconLoadTarget; +} + +impl FalconUCodeDescriptor for FalconUCodeDescV2 { + fn hdr(&self) -> u32 { + self.hdr + } + fn imem_load_size(&self) -> u32 { + self.imem_load_size + } + fn interface_offset(&self) -> u32 { + self.interface_offset + } + fn dmem_load_size(&self) -> u32 { + self.dmem_load_size + } + fn pkc_data_offset(&self) -> u32 { + 0 + } + fn engine_id_mask(&self) -> u16 { + 0 + } + fn ucode_id(&self) -> u8 { + 0 + } + fn signature_count(&self) -> u8 { + 0 + } + fn signature_versions(&self) -> u16 { + 0 + } + + fn imem_sec_load_params(&self) -> FalconLoadTarget { + FalconLoadTarget { + src_start: 0, + dst_start: self.imem_sec_base, + len: self.imem_sec_size, + } + } + + fn imem_ns_load_params(&self) -> Option<FalconLoadTarget> { + Some(FalconLoadTarget { + src_start: 0, + dst_start: self.imem_phys_base, + len: self.imem_load_size.checked_sub(self.imem_sec_size)?, + }) + } + + fn dmem_load_params(&self) -> FalconLoadTarget { + FalconLoadTarget { + src_start: self.dmem_offset, + dst_start: self.dmem_phys_base, + len: self.dmem_load_size, + } + } +} + +impl FalconUCodeDescriptor for FalconUCodeDescV3 { + fn hdr(&self) -> u32 { + self.hdr + } + fn imem_load_size(&self) -> u32 { + self.imem_load_size + } + fn interface_offset(&self) -> u32 { + self.interface_offset + } + fn dmem_load_size(&self) -> u32 { + self.dmem_load_size + } + fn pkc_data_offset(&self) -> u32 { + self.pkc_data_offset + } + fn engine_id_mask(&self) -> u16 { + self.engine_id_mask + } + fn ucode_id(&self) -> u8 { + self.ucode_id + } + fn signature_count(&self) -> u8 { + self.signature_count + } + fn signature_versions(&self) -> u16 { + self.signature_versions + } + + fn imem_sec_load_params(&self) -> FalconLoadTarget { + FalconLoadTarget { + src_start: 0, + dst_start: self.imem_phys_base, + len: self.imem_load_size, + } + } + + fn imem_ns_load_params(&self) -> Option<FalconLoadTarget> { + // Not used on V3 platforms + None + } + + fn dmem_load_params(&self) -> FalconLoadTarget { + FalconLoadTarget { + src_start: self.imem_load_size, + dst_start: self.dmem_phys_base, + len: self.dmem_load_size, + } } } diff --git a/drivers/gpu/nova-core/firmware/booter.rs b/drivers/gpu/nova-core/firmware/booter.rs index f107f753214a..86556cee8e67 100644 --- a/drivers/gpu/nova-core/firmware/booter.rs +++ b/drivers/gpu/nova-core/firmware/booter.rs @@ -251,8 +251,11 @@ impl<'a> FirmwareSignature<BooterFirmware> for BooterSignature<'a> {} /// The `Booter` loader firmware, responsible for loading the GSP. pub(crate) struct BooterFirmware { - // Load parameters for `IMEM` falcon memory. - imem_load_target: FalconLoadTarget, + // Load parameters for Secure `IMEM` falcon memory. + imem_sec_load_target: FalconLoadTarget, + // Load parameters for Non-Secure `IMEM` falcon memory, + // used only on Turing and GA100 + imem_ns_load_target: Option<FalconLoadTarget>, // Load parameters for `DMEM` falcon memory. dmem_load_target: FalconLoadTarget, // BROM falcon parameters. @@ -353,12 +356,30 @@ impl BooterFirmware { } }; + // There are two versions of Booter, one for Turing/GA100, and another for + // GA102+. The extraction of the IMEM sections differs between the two + // versions. Unfortunately, the file names are the same, and the headers + // don't indicate the versions. The only way to differentiate is by the Chipset. + let (imem_sec_dst_start, imem_ns_load_target) = if chipset <= Chipset::GA100 { + ( + app0.offset, + Some(FalconLoadTarget { + src_start: 0, + dst_start: load_hdr.os_code_offset, + len: load_hdr.os_code_size, + }), + ) + } else { + (0, None) + }; + Ok(Self { - imem_load_target: FalconLoadTarget { + imem_sec_load_target: FalconLoadTarget { src_start: app0.offset, - dst_start: 0, + dst_start: imem_sec_dst_start, len: app0.len, }, + imem_ns_load_target, dmem_load_target: FalconLoadTarget { src_start: load_hdr.os_data_offset, dst_start: 0, @@ -371,8 +392,12 @@ impl BooterFirmware { } impl FalconLoadParams for BooterFirmware { - fn imem_load_params(&self) -> FalconLoadTarget { - self.imem_load_target.clone() + fn imem_sec_load_params(&self) -> FalconLoadTarget { + self.imem_sec_load_target.clone() + } + + fn imem_ns_load_params(&self) -> Option<FalconLoadTarget> { + self.imem_ns_load_target.clone() } fn dmem_load_params(&self) -> FalconLoadTarget { @@ -384,7 +409,11 @@ impl FalconLoadParams for BooterFirmware { } fn boot_addr(&self) -> u32 { - self.imem_load_target.src_start + if let Some(ns_target) = &self.imem_ns_load_target { + ns_target.dst_start + } else { + self.imem_sec_load_target.src_start + } } } diff --git a/drivers/gpu/nova-core/firmware/fwsec.rs b/drivers/gpu/nova-core/firmware/fwsec.rs index b28e34d279f4..a8ec08a500ac 100644 --- a/drivers/gpu/nova-core/firmware/fwsec.rs +++ b/drivers/gpu/nova-core/firmware/fwsec.rs @@ -40,7 +40,7 @@ use crate::{ FalconLoadTarget, // }, firmware::{ - FalconUCodeDescV3, + FalconUCodeDesc, FirmwareDmaObject, FirmwareSignature, Signed, @@ -218,33 +218,29 @@ unsafe fn transmute_mut<T: Sized + FromBytes + AsBytes>( /// It is responsible for e.g. carving out the WPR2 region as the first step of the GSP bootflow. pub(crate) struct FwsecFirmware { /// Descriptor of the firmware. - desc: FalconUCodeDescV3, + desc: FalconUCodeDesc, /// GPU-accessible DMA object containing the firmware. ucode: FirmwareDmaObject<Self, Signed>, } impl FalconLoadParams for FwsecFirmware { - fn imem_load_params(&self) -> FalconLoadTarget { - FalconLoadTarget { - src_start: 0, - dst_start: self.desc.imem_phys_base, - len: self.desc.imem_load_size, - } + fn imem_sec_load_params(&self) -> FalconLoadTarget { + self.desc.imem_sec_load_params() + } + + fn imem_ns_load_params(&self) -> Option<FalconLoadTarget> { + self.desc.imem_ns_load_params() } fn dmem_load_params(&self) -> FalconLoadTarget { - FalconLoadTarget { - src_start: self.desc.imem_load_size, - dst_start: self.desc.dmem_phys_base, - len: self.desc.dmem_load_size, - } + self.desc.dmem_load_params() } fn brom_params(&self) -> FalconBromParams { FalconBromParams { - pkc_data_offset: self.desc.pkc_data_offset, - engine_id_mask: self.desc.engine_id_mask, - ucode_id: self.desc.ucode_id, + pkc_data_offset: self.desc.pkc_data_offset(), + engine_id_mask: self.desc.engine_id_mask(), + ucode_id: self.desc.ucode_id(), } } @@ -268,10 +264,10 @@ impl FalconFirmware for FwsecFirmware { impl FirmwareDmaObject<FwsecFirmware, Unsigned> { fn new_fwsec(dev: &Device<device::Bound>, bios: &Vbios, cmd: FwsecCommand) -> Result<Self> { let desc = bios.fwsec_image().header()?; - let ucode = bios.fwsec_image().ucode(desc)?; + let ucode = bios.fwsec_image().ucode(&desc)?; let mut dma_object = DmaObject::from_data(dev, ucode)?; - let hdr_offset = usize::from_safe_cast(desc.imem_load_size + desc.interface_offset); + let hdr_offset = usize::from_safe_cast(desc.imem_load_size() + desc.interface_offset()); // SAFETY: we have exclusive access to `dma_object`. let hdr: &FalconAppifHdrV1 = unsafe { transmute(&dma_object, hdr_offset) }?; @@ -298,7 +294,7 @@ impl FirmwareDmaObject<FwsecFirmware, Unsigned> { let dmem_mapper: &mut FalconAppifDmemmapperV3 = unsafe { transmute_mut( &mut dma_object, - (desc.imem_load_size + dmem_base).into_safe_cast(), + (desc.imem_load_size() + dmem_base).into_safe_cast(), ) }?; @@ -312,7 +308,7 @@ impl FirmwareDmaObject<FwsecFirmware, Unsigned> { let frts_cmd: &mut FrtsCmd = unsafe { transmute_mut( &mut dma_object, - (desc.imem_load_size + cmd_in_buffer_offset).into_safe_cast(), + (desc.imem_load_size() + cmd_in_buffer_offset).into_safe_cast(), ) }?; @@ -359,11 +355,12 @@ impl FwsecFirmware { // Patch signature if needed. let desc = bios.fwsec_image().header()?; - let ucode_signed = if desc.signature_count != 0 { - let sig_base_img = usize::from_safe_cast(desc.imem_load_size + desc.pkc_data_offset); - let desc_sig_versions = u32::from(desc.signature_versions); + let ucode_signed = if desc.signature_count() != 0 { + let sig_base_img = + usize::from_safe_cast(desc.imem_load_size() + desc.pkc_data_offset()); + let desc_sig_versions = u32::from(desc.signature_versions()); let reg_fuse_version = - falcon.signature_reg_fuse_version(bar, desc.engine_id_mask, desc.ucode_id)?; + falcon.signature_reg_fuse_version(bar, desc.engine_id_mask(), desc.ucode_id())?; dev_dbg!( dev, "desc_sig_versions: {:#x}, reg_fuse_version: {}\n", @@ -397,7 +394,7 @@ impl FwsecFirmware { dev_dbg!(dev, "patching signature with index {}\n", signature_idx); let signature = bios .fwsec_image() - .sigs(desc) + .sigs(&desc) .and_then(|sigs| sigs.get(signature_idx).ok_or(EINVAL))?; ucode_dma.patch_signature(signature, sig_base_img)? @@ -406,7 +403,7 @@ impl FwsecFirmware { }; Ok(FwsecFirmware { - desc: desc.clone(), + desc, ucode: ucode_signed, }) } @@ -423,7 +420,7 @@ impl FwsecFirmware { .reset(bar) .inspect_err(|e| dev_err!(dev, "Failed to reset GSP falcon: {:?}\n", e))?; falcon - .dma_load(bar, self) + .load(bar, self) .inspect_err(|e| dev_err!(dev, "Failed to load FWSEC firmware: {:?}\n", e))?; let (mbox0, _) = falcon .boot(bar, Some(0), None) diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs index 0549805282ab..beabae9a1189 100644 --- a/drivers/gpu/nova-core/firmware/gsp.rs +++ b/drivers/gpu/nova-core/firmware/gsp.rs @@ -93,10 +93,7 @@ mod elf { // Get the start of the name. elf.get(name_idx..) - // Stop at the first `0`. - .and_then(|nstr| nstr.get(0..=nstr.iter().position(|b| *b == 0)?)) - // Convert into CStr. This should never fail because of the line above. - .and_then(|nstr| CStr::from_bytes_with_nul(nstr).ok()) + .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok()) // Convert into str. .and_then(|c_str| c_str.to_str().ok()) // Check that the name matches. @@ -153,82 +150,93 @@ pub(crate) struct GspFirmware { impl GspFirmware { /// Loads the GSP firmware binaries, map them into `dev`'s address-space, and creates the page /// tables expected by the GSP bootloader to load it. - pub(crate) fn new<'a, 'b>( + pub(crate) fn new<'a>( dev: &'a device::Device<device::Bound>, chipset: Chipset, - ver: &'b str, - ) -> Result<impl PinInit<Self, Error> + 'a> { - let fw = super::request_firmware(dev, chipset, "gsp", ver)?; + ver: &'a str, + ) -> impl PinInit<Self, Error> + 'a { + pin_init::pin_init_scope(move || { + let firmware = super::request_firmware(dev, chipset, "gsp", ver)?; - let fw_section = elf::elf64_section(fw.data(), ".fwimage").ok_or(EINVAL)?; + let fw_section = elf::elf64_section(firmware.data(), ".fwimage").ok_or(EINVAL)?; - let sigs_section = match chipset.arch() { - Architecture::Ampere => ".fwsignature_ga10x", - Architecture::Ada => ".fwsignature_ad10x", - _ => return Err(ENOTSUPP), - }; - let signatures = elf::elf64_section(fw.data(), sigs_section) - .ok_or(EINVAL) - .and_then(|data| DmaObject::from_data(dev, data))?; + let size = fw_section.len(); - let size = fw_section.len(); + // Move the firmware into a vmalloc'd vector and map it into the device address + // space. + let fw_vvec = VVec::with_capacity(fw_section.len(), GFP_KERNEL) + .and_then(|mut v| { + v.extend_from_slice(fw_section, GFP_KERNEL)?; + Ok(v) + }) + .map_err(|_| ENOMEM)?; - // Move the firmware into a vmalloc'd vector and map it into the device address - // space. - let fw_vvec = VVec::with_capacity(fw_section.len(), GFP_KERNEL) - .and_then(|mut v| { - v.extend_from_slice(fw_section, GFP_KERNEL)?; - Ok(v) - }) - .map_err(|_| ENOMEM)?; + Ok(try_pin_init!(Self { + fw <- SGTable::new(dev, fw_vvec, DataDirection::ToDevice, GFP_KERNEL), + level2 <- { + // Allocate the level 2 page table, map the firmware onto it, and map it into + // the device address space. + VVec::<u8>::with_capacity( + fw.iter().count() * core::mem::size_of::<u64>(), + GFP_KERNEL, + ) + .map_err(|_| ENOMEM) + .and_then(|level2| map_into_lvl(&fw, level2)) + .map(|level2| SGTable::new(dev, level2, DataDirection::ToDevice, GFP_KERNEL))? + }, + level1 <- { + // Allocate the level 1 page table, map the level 2 page table onto it, and map + // it into the device address space. + VVec::<u8>::with_capacity( + level2.iter().count() * core::mem::size_of::<u64>(), + GFP_KERNEL, + ) + .map_err(|_| ENOMEM) + .and_then(|level1| map_into_lvl(&level2, level1)) + .map(|level1| SGTable::new(dev, level1, DataDirection::ToDevice, GFP_KERNEL))? + }, + level0: { + // Allocate the level 0 page table as a device-visible DMA object, and map the + // level 1 page table onto it. - let bl = super::request_firmware(dev, chipset, "bootloader", ver)?; - let bootloader = RiscvFirmware::new(dev, &bl)?; + // Level 0 page table data. + let mut level0_data = kvec![0u8; GSP_PAGE_SIZE]?; - Ok(try_pin_init!(Self { - fw <- SGTable::new(dev, fw_vvec, DataDirection::ToDevice, GFP_KERNEL), - level2 <- { - // Allocate the level 2 page table, map the firmware onto it, and map it into the - // device address space. - VVec::<u8>::with_capacity( - fw.iter().count() * core::mem::size_of::<u64>(), - GFP_KERNEL, - ) - .map_err(|_| ENOMEM) - .and_then(|level2| map_into_lvl(&fw, level2)) - .map(|level2| SGTable::new(dev, level2, DataDirection::ToDevice, GFP_KERNEL))? - }, - level1 <- { - // Allocate the level 1 page table, map the level 2 page table onto it, and map it - // into the device address space. - VVec::<u8>::with_capacity( - level2.iter().count() * core::mem::size_of::<u64>(), - GFP_KERNEL, - ) - .map_err(|_| ENOMEM) - .and_then(|level1| map_into_lvl(&level2, level1)) - .map(|level1| SGTable::new(dev, level1, DataDirection::ToDevice, GFP_KERNEL))? - }, - level0: { - // Allocate the level 0 page table as a device-visible DMA object, and map the - // level 1 page table onto it. + // Fill level 1 page entry. + let level1_entry = level1.iter().next().ok_or(EINVAL)?; + let level1_entry_addr = level1_entry.dma_address(); + let dst = &mut level0_data[..size_of_val(&level1_entry_addr)]; + dst.copy_from_slice(&level1_entry_addr.to_le_bytes()); - // Level 0 page table data. - let mut level0_data = kvec![0u8; GSP_PAGE_SIZE]?; + // Turn the level0 page table into a [`DmaObject`]. + DmaObject::from_data(dev, &level0_data)? + }, + size, + signatures: { + let sigs_section = match chipset.arch() { + Architecture::Turing + if matches!(chipset, Chipset::TU116 | Chipset::TU117) => + { + ".fwsignature_tu11x" + } + Architecture::Turing => ".fwsignature_tu10x", + // GA100 uses the same firmware as Turing + Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x", + Architecture::Ampere => ".fwsignature_ga10x", + Architecture::Ada => ".fwsignature_ad10x", + }; - // Fill level 1 page entry. - let level1_entry = level1.iter().next().ok_or(EINVAL)?; - let level1_entry_addr = level1_entry.dma_address(); - let dst = &mut level0_data[..size_of_val(&level1_entry_addr)]; - dst.copy_from_slice(&level1_entry_addr.to_le_bytes()); + elf::elf64_section(firmware.data(), sigs_section) + .ok_or(EINVAL) + .and_then(|data| DmaObject::from_data(dev, data))? + }, + bootloader: { + let bl = super::request_firmware(dev, chipset, "bootloader", ver)?; - // Turn the level0 page table into a [`DmaObject`]. - DmaObject::from_data(dev, &level0_data)? - }, - size, - signatures, - bootloader, - })) + RiscvFirmware::new(dev, &bl)? + }, + })) + }) } /// Returns the DMA handle of the radix3 level 0 page table. diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs index 629c9d2dc994..9b042ef1a308 100644 --- a/drivers/gpu/nova-core/gpu.rs +++ b/drivers/gpu/nova-core/gpu.rs @@ -268,7 +268,7 @@ impl Gpu { // We must wait for GFW_BOOT completion before doing any significant setup on the GPU. _: { gfw::wait_gfw_boot_completion(bar) - .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?; + .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete\n"))?; }, sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?, @@ -281,7 +281,7 @@ impl Gpu { sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset)?, - gsp <- Gsp::new(pdev)?, + gsp <- Gsp::new(pdev), _: { gsp.boot(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)? }, diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs index fb6f74797178..174feaca0a6b 100644 --- a/drivers/gpu/nova-core/gsp.rs +++ b/drivers/gpu/nova-core/gsp.rs @@ -27,7 +27,7 @@ pub(crate) use fw::{ use crate::{ gsp::cmdq::Cmdq, gsp::fw::{ - GspArgumentsCached, + GspArgumentsPadded, LibosMemoryRegionInitArgument, // }, num, @@ -114,48 +114,45 @@ pub(crate) struct Gsp { /// Command queue. pub(crate) cmdq: Cmdq, /// RM arguments. - rmargs: CoherentAllocation<GspArgumentsCached>, + rmargs: CoherentAllocation<GspArgumentsPadded>, } impl Gsp { // Creates an in-place initializer for a `Gsp` manager for `pdev`. - pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> Result<impl PinInit<Self, Error>> { - let dev = pdev.as_ref(); - let libos = CoherentAllocation::<LibosMemoryRegionInitArgument>::alloc_coherent( - dev, - GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(), - GFP_KERNEL | __GFP_ZERO, - )?; - - // Initialise the logging structures. The OpenRM equivalents are in: - // _kgspInitLibosLoggingStructures (allocates memory for buffers) - // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array) - let loginit = LogBuffer::new(dev)?; - dma_write!(libos[0] = LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0))?; - - let logintr = LogBuffer::new(dev)?; - dma_write!(libos[1] = LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0))?; - - let logrm = LogBuffer::new(dev)?; - dma_write!(libos[2] = LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0))?; - - let cmdq = Cmdq::new(dev)?; - - let rmargs = CoherentAllocation::<GspArgumentsCached>::alloc_coherent( - dev, - 1, - GFP_KERNEL | __GFP_ZERO, - )?; - dma_write!(rmargs[0] = fw::GspArgumentsCached::new(&cmdq))?; - dma_write!(libos[3] = LibosMemoryRegionInitArgument::new("RMARGS", &rmargs))?; - - Ok(try_pin_init!(Self { - libos, - loginit, - logintr, - logrm, - rmargs, - cmdq, - })) + pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> impl PinInit<Self, Error> + '_ { + pin_init::pin_init_scope(move || { + let dev = pdev.as_ref(); + + Ok(try_pin_init!(Self { + libos: CoherentAllocation::<LibosMemoryRegionInitArgument>::alloc_coherent( + dev, + GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(), + GFP_KERNEL | __GFP_ZERO, + )?, + loginit: LogBuffer::new(dev)?, + logintr: LogBuffer::new(dev)?, + logrm: LogBuffer::new(dev)?, + cmdq: Cmdq::new(dev)?, + rmargs: CoherentAllocation::<GspArgumentsPadded>::alloc_coherent( + dev, + 1, + GFP_KERNEL | __GFP_ZERO, + )?, + _: { + // Initialise the logging structures. The OpenRM equivalents are in: + // _kgspInitLibosLoggingStructures (allocates memory for buffers) + // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array) + dma_write!( + libos[0] = LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0) + )?; + dma_write!( + libos[1] = LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0) + )?; + dma_write!(libos[2] = LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0))?; + dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(cmdq))?; + dma_write!(libos[3] = LibosMemoryRegionInitArgument::new("RMARGS", rmargs))?; + }, + })) + }) } } diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs index 54937606b5b0..be427fe26a58 100644 --- a/drivers/gpu/nova-core/gsp/boot.rs +++ b/drivers/gpu/nova-core/gsp/boot.rs @@ -82,7 +82,7 @@ impl super::Gsp { if frts_status != 0 { dev_err!( dev, - "FWSEC-FRTS returned with error code {:#x}", + "FWSEC-FRTS returned with error code {:#x}\n", frts_status ); @@ -139,10 +139,7 @@ impl super::Gsp { let bios = Vbios::new(dev, bar)?; - let gsp_fw = KBox::pin_init( - GspFirmware::new(dev, chipset, FIRMWARE_VERSION)?, - GFP_KERNEL, - )?; + let gsp_fw = KBox::pin_init(GspFirmware::new(dev, chipset, FIRMWARE_VERSION), GFP_KERNEL)?; let fb_layout = FbLayout::new(chipset, bar, &gsp_fw)?; dev_dbg!(dev, "{:#x?}\n", fb_layout); @@ -186,7 +183,7 @@ impl super::Gsp { ); sec2_falcon.reset(bar)?; - sec2_falcon.dma_load(bar, &booter_loader)?; + sec2_falcon.load(bar, &booter_loader)?; let wpr_handle = wpr_meta.dma_handle(); let (mbox0, mbox1) = sec2_falcon.boot( bar, @@ -241,11 +238,10 @@ impl super::Gsp { // Obtain and display basic GPU information. let info = commands::get_gsp_info(&mut self.cmdq, bar)?; - dev_info!( - pdev.as_ref(), - "GPU name: {}\n", - info.gpu_name().unwrap_or("invalid GPU name") - ); + match info.gpu_name() { + Ok(name) => dev_info!(pdev.as_ref(), "GPU name: {}\n", name), + Err(e) => dev_warn!(pdev.as_ref(), "GPU name unavailable: {:?}\n", e), + } Ok(()) } diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs index 3991ccc0c10f..46819a82a51a 100644 --- a/drivers/gpu/nova-core/gsp/cmdq.rs +++ b/drivers/gpu/nova-core/gsp/cmdq.rs @@ -617,7 +617,7 @@ impl Cmdq { { dev_err!( self.dev, - "GSP RPC: receive: Call {} - bad checksum", + "GSP RPC: receive: Call {} - bad checksum\n", header.sequence() ); return Err(EIO); diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs index 0425c65b5d6f..c8430a076269 100644 --- a/drivers/gpu/nova-core/gsp/commands.rs +++ b/drivers/gpu/nova-core/gsp/commands.rs @@ -2,7 +2,9 @@ use core::{ array, - convert::Infallible, // + convert::Infallible, + ffi::FromBytesUntilNulError, + str::Utf8Error, // }; use kernel::{ @@ -30,7 +32,6 @@ use crate::{ }, }, sbuffer::SBufferIter, - util, }; /// The `GspSetSystemInfo` command. @@ -205,11 +206,27 @@ impl MessageFromGsp for GetGspStaticInfoReply { } } +/// Error type for [`GetGspStaticInfoReply::gpu_name`]. +#[derive(Debug)] +pub(crate) enum GpuNameError { + /// The GPU name string does not contain a null terminator. + NoNullTerminator(FromBytesUntilNulError), + + /// The GPU name string contains invalid UTF-8. + #[expect(dead_code)] + InvalidUtf8(Utf8Error), +} + impl GetGspStaticInfoReply { - /// Returns the name of the GPU as a string, or `None` if the string given by the GSP was - /// invalid. - pub(crate) fn gpu_name(&self) -> Option<&str> { - util::str_from_null_terminated(&self.gpu_name) + /// Returns the name of the GPU as a string. + /// + /// Returns an error if the string given by the GSP does not contain a null terminator or + /// contains invalid UTF-8. + pub(crate) fn gpu_name(&self) -> core::result::Result<&str, GpuNameError> { + CStr::from_bytes_until_nul(&self.gpu_name) + .map_err(GpuNameError::NoNullTerminator)? + .to_str() + .map_err(GpuNameError::InvalidUtf8) } } diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs index caeb0d251fe5..83ff91614e36 100644 --- a/drivers/gpu/nova-core/gsp/fw.rs +++ b/drivers/gpu/nova-core/gsp/fw.rs @@ -904,9 +904,21 @@ impl GspArgumentsCached { // SAFETY: Padding is explicit and will not contain uninitialized data. unsafe impl AsBytes for GspArgumentsCached {} +/// On Turing and GA100, the entries in the `LibosMemoryRegionInitArgument` +/// must all be a multiple of GSP_PAGE_SIZE in size, so add padding to force it +/// to that size. +#[repr(C)] +pub(crate) struct GspArgumentsPadded { + pub(crate) inner: GspArgumentsCached, + _padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()], +} + +// SAFETY: Padding is explicit and will not contain uninitialized data. +unsafe impl AsBytes for GspArgumentsPadded {} + // SAFETY: This struct only contains integer types for which all bit patterns // are valid. -unsafe impl FromBytes for GspArgumentsCached {} +unsafe impl FromBytes for GspArgumentsPadded {} /// Init arguments for the message queue. #[repr(transparent)] diff --git a/drivers/gpu/nova-core/gsp/sequencer.rs b/drivers/gpu/nova-core/gsp/sequencer.rs index 2d0369c49092..d6c489c39092 100644 --- a/drivers/gpu/nova-core/gsp/sequencer.rs +++ b/drivers/gpu/nova-core/gsp/sequencer.rs @@ -14,12 +14,12 @@ use kernel::{ device, io::poll::read_poll_timeout, prelude::*, + sync::aref::ARef, time::{ delay::fsleep, Delta, // }, - transmute::FromBytes, - types::ARef, // + transmute::FromBytes, // }; use crate::{ @@ -121,7 +121,7 @@ impl GspSeqCmd { }; if data.len() < size { - dev_err!(dev, "Data is not enough for command"); + dev_err!(dev, "Data is not enough for command\n"); return Err(EINVAL); } @@ -320,7 +320,7 @@ impl<'a> Iterator for GspSeqIter<'a> { cmd_result.map_or_else( |_err| { - dev_err!(self.dev, "Error parsing command at offset {}", offset); + dev_err!(self.dev, "Error parsing command at offset {}\n", offset); None }, |(cmd, size)| { @@ -382,7 +382,7 @@ impl<'a> GspSequencer<'a> { dev: params.dev, }; - dev_dbg!(sequencer.dev, "Running CPU Sequencer commands"); + dev_dbg!(sequencer.dev, "Running CPU Sequencer commands\n"); for cmd_result in sequencer.iter() { match cmd_result { @@ -390,7 +390,7 @@ impl<'a> GspSequencer<'a> { Err(e) => { dev_err!( sequencer.dev, - "Error running command at index {}", + "Error running command at index {}\n", sequencer.seq_info.cmd_index ); return Err(e); @@ -400,7 +400,7 @@ impl<'a> GspSequencer<'a> { dev_dbg!( sequencer.dev, - "CPU Sequencer commands completed successfully" + "CPU Sequencer commands completed successfully\n" ); Ok(()) } diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs index b98a1c03f13d..c1121e7c64c5 100644 --- a/drivers/gpu/nova-core/nova_core.rs +++ b/drivers/gpu/nova-core/nova_core.rs @@ -16,7 +16,6 @@ mod gsp; mod num; mod regs; mod sbuffer; -mod util; mod vbios; pub(crate) const MODULE_NAME: &kernel::str::CStr = <LocalModule as kernel::ModuleMetadata>::NAME; diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs index 82cc6c0790e5..ea0d32f5396c 100644 --- a/drivers/gpu/nova-core/regs.rs +++ b/drivers/gpu/nova-core/regs.rs @@ -7,15 +7,21 @@ #[macro_use] pub(crate) mod macros; -use kernel::prelude::*; +use kernel::{ + prelude::*, + time, // +}; use crate::{ + driver::Bar0, falcon::{ DmaTrfCmdSize, FalconCoreRev, FalconCoreRevSubversion, + FalconEngine, FalconFbifMemType, FalconFbifTarget, + FalconMem, FalconModSelAlgo, FalconSecurityModel, PFalcon2Base, @@ -306,6 +312,13 @@ register!(NV_PFALCON_FALCON_DMACTL @ PFalconBase[0x0000010c] { 7:7 secure_stat as bool; }); +impl NV_PFALCON_FALCON_DMACTL { + /// Returns `true` if memory scrubbing is completed. + pub(crate) fn mem_scrubbing_done(self) -> bool { + !self.dmem_scrubbing() && !self.imem_scrubbing() + } +} + register!(NV_PFALCON_FALCON_DMATRFBASE @ PFalconBase[0x00000110] { 31:0 base as u32; }); @@ -325,6 +338,14 @@ register!(NV_PFALCON_FALCON_DMATRFCMD @ PFalconBase[0x00000118] { 16:16 set_dmtag as u8; }); +impl NV_PFALCON_FALCON_DMATRFCMD { + /// Programs the `imem` and `sec` fields for the given FalconMem + pub(crate) fn with_falcon_mem(self, mem: FalconMem) -> Self { + self.set_imem(mem != FalconMem::Dmem) + .set_sec(if mem == FalconMem::ImemSecure { 1 } else { 0 }) + } +} + register!(NV_PFALCON_FALCON_DMATRFFBOFFS @ PFalconBase[0x0000011c] { 31:0 offs as u32; }); @@ -349,6 +370,18 @@ register!(NV_PFALCON_FALCON_ENGINE @ PFalconBase[0x000003c0] { 0:0 reset as bool; }); +impl NV_PFALCON_FALCON_ENGINE { + /// Resets the falcon + pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) { + Self::read(bar, &E::ID).set_reset(true).write(bar, &E::ID); + + // TIMEOUT: falcon engine should not take more than 10us to reset. + time::delay::fsleep(time::Delta::from_micros(10)); + + Self::read(bar, &E::ID).set_reset(false).write(bar, &E::ID); + } +} + register!(NV_PFALCON_FBIF_TRANSCFG @ PFalconBase[0x00000600[8]] { 1:0 target as u8 ?=> FalconFbifTarget; 2:2 mem_type as bool => FalconFbifMemType; @@ -380,6 +413,13 @@ register!(NV_PFALCON2_FALCON_BROM_PARAADDR @ PFalcon2Base[0x00000210[1]] { // PRISCV +// RISC-V status register for debug (Turing and GA100 only). +// Reflects current RISC-V core status. +register!(NV_PRISCV_RISCV_CORE_SWITCH_RISCV_STATUS @ PFalcon2Base[0x00000240] { + 0:0 active_stat as bool, "RISC-V core active/inactive status"; +}); + +// GA102 and later register!(NV_PRISCV_RISCV_CPUCTL @ PFalcon2Base[0x00000388] { 0:0 halted as bool; 7:7 active_stat as bool; diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs deleted file mode 100644 index 4b503249a3ef..000000000000 --- a/drivers/gpu/nova-core/util.rs +++ /dev/null @@ -1,16 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 - -/// Converts a null-terminated byte slice to a string, or `None` if the array does not -/// contains any null byte or contains invalid characters. -/// -/// Contrary to [`kernel::str::CStr::from_bytes_with_nul`], the null byte can be anywhere in the -/// slice, and not only in the last position. -pub(crate) fn str_from_null_terminated(bytes: &[u8]) -> Option<&str> { - use kernel::str::CStr; - - bytes - .iter() - .position(|&b| b == 0) - .and_then(|null_pos| CStr::from_bytes_with_nul(&bytes[..=null_pos]).ok()) - .and_then(|cstr| cstr.to_str().ok()) -} diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs index abf423560ff4..72cba8659a2d 100644 --- a/drivers/gpu/nova-core/vbios.rs +++ b/drivers/gpu/nova-core/vbios.rs @@ -11,14 +11,16 @@ use kernel::{ Alignable, Alignment, // }, + sync::aref::ARef, transmute::FromBytes, - types::ARef, }; use crate::{ driver::Bar0, firmware::{ fwsec::Bcrt30Rsa3kSignature, + FalconUCodeDesc, + FalconUCodeDescV2, FalconUCodeDescV3, // }, num::FromSafeCast, @@ -790,7 +792,7 @@ impl PciAtBiosImage { // read the 4 bytes at the offset specified in the token let offset = usize::from(token.data_offset); let bytes: [u8; 4] = self.base.data[offset..offset + 4].try_into().map_err(|_| { - dev_err!(self.base.dev, "Failed to convert data slice to array"); + dev_err!(self.base.dev, "Failed to convert data slice to array\n"); EINVAL })?; @@ -887,11 +889,6 @@ impl PmuLookupTable { ret }; - // Debug logging of entries (dumps the table data to dmesg) - for i in (header_len..required_bytes).step_by(entry_len) { - dev_dbg!(dev, "PMU entry: {:02x?}\n", &data[i..][..entry_len]); - } - Ok(PmuLookupTable { header, table_data }) } @@ -1003,20 +1000,11 @@ impl FwSecBiosBuilder { } impl FwSecBiosImage { - /// Get the FwSec header ([`FalconUCodeDescV3`]). - pub(crate) fn header(&self) -> Result<&FalconUCodeDescV3> { + /// Get the FwSec header ([`FalconUCodeDesc`]). + pub(crate) fn header(&self) -> Result<FalconUCodeDesc> { // Get the falcon ucode offset that was found in setup_falcon_data. let falcon_ucode_offset = self.falcon_ucode_offset; - // Make sure the offset is within the data bounds. - if falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>() > self.base.data.len() { - dev_err!( - self.base.dev, - "fwsec-frts header not contained within BIOS bounds\n" - ); - return Err(ERANGE); - } - // Read the first 4 bytes to get the version. let hdr_bytes: [u8; 4] = self.base.data[falcon_ucode_offset..falcon_ucode_offset + 4] .try_into() @@ -1024,33 +1012,34 @@ impl FwSecBiosImage { let hdr = u32::from_le_bytes(hdr_bytes); let ver = (hdr & 0xff00) >> 8; - if ver != 3 { - dev_err!(self.base.dev, "invalid fwsec firmware version: {:?}\n", ver); - return Err(EINVAL); + let data = self.base.data.get(falcon_ucode_offset..).ok_or(EINVAL)?; + match ver { + 2 => { + let v2 = FalconUCodeDescV2::from_bytes_copy_prefix(data) + .ok_or(EINVAL)? + .0; + Ok(FalconUCodeDesc::V2(v2)) + } + 3 => { + let v3 = FalconUCodeDescV3::from_bytes_copy_prefix(data) + .ok_or(EINVAL)? + .0; + Ok(FalconUCodeDesc::V3(v3)) + } + _ => { + dev_err!(self.base.dev, "invalid fwsec firmware version: {:?}\n", ver); + Err(EINVAL) + } } - - // Return a reference to the FalconUCodeDescV3 structure. - // - // SAFETY: We have checked that `falcon_ucode_offset + size_of::<FalconUCodeDescV3>` is - // within the bounds of `data`. Also, this data vector is from ROM, and the `data` field - // in `BiosImageBase` is immutable after construction. - Ok(unsafe { - &*(self - .base - .data - .as_ptr() - .add(falcon_ucode_offset) - .cast::<FalconUCodeDescV3>()) - }) } /// Get the ucode data as a byte slice - pub(crate) fn ucode(&self, desc: &FalconUCodeDescV3) -> Result<&[u8]> { + pub(crate) fn ucode(&self, desc: &FalconUCodeDesc) -> Result<&[u8]> { let falcon_ucode_offset = self.falcon_ucode_offset; // The ucode data follows the descriptor. let ucode_data_offset = falcon_ucode_offset + desc.size(); - let size = usize::from_safe_cast(desc.imem_load_size + desc.dmem_load_size); + let size = usize::from_safe_cast(desc.imem_load_size() + desc.dmem_load_size()); // Get the data slice, checking bounds in a single operation. self.base @@ -1066,10 +1055,14 @@ impl FwSecBiosImage { } /// Get the signatures as a byte slice - pub(crate) fn sigs(&self, desc: &FalconUCodeDescV3) -> Result<&[Bcrt30Rsa3kSignature]> { + pub(crate) fn sigs(&self, desc: &FalconUCodeDesc) -> Result<&[Bcrt30Rsa3kSignature]> { + let hdr_size = match desc { + FalconUCodeDesc::V2(_v2) => core::mem::size_of::<FalconUCodeDescV2>(), + FalconUCodeDesc::V3(_v3) => core::mem::size_of::<FalconUCodeDescV3>(), + }; // The signatures data follows the descriptor. - let sigs_data_offset = self.falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>(); - let sigs_count = usize::from(desc.signature_count); + let sigs_data_offset = self.falcon_ucode_offset + hdr_size; + let sigs_count = usize::from(desc.signature_count()); let sigs_size = sigs_count * core::mem::size_of::<Bcrt30Rsa3kSignature>(); // Make sure the data is within bounds. diff --git a/rust/helpers/drm.c b/rust/helpers/drm.c index 450b406c6f27..fe226f7b53ef 100644 --- a/rust/helpers/drm.c +++ b/rust/helpers/drm.c @@ -5,17 +5,18 @@ #ifdef CONFIG_DRM -void rust_helper_drm_gem_object_get(struct drm_gem_object *obj) +__rust_helper void rust_helper_drm_gem_object_get(struct drm_gem_object *obj) { drm_gem_object_get(obj); } -void rust_helper_drm_gem_object_put(struct drm_gem_object *obj) +__rust_helper void rust_helper_drm_gem_object_put(struct drm_gem_object *obj) { drm_gem_object_put(obj); } -__u64 rust_helper_drm_vma_node_offset_addr(struct drm_vma_offset_node *node) +__rust_helper __u64 +rust_helper_drm_vma_node_offset_addr(struct drm_vma_offset_node *node) { return drm_vma_node_offset_addr(node); } diff --git a/rust/kernel/drm/driver.rs b/rust/kernel/drm/driver.rs index f30ee4c6245c..e09f977b5b51 100644 --- a/rust/kernel/drm/driver.rs +++ b/rust/kernel/drm/driver.rs @@ -121,7 +121,6 @@ pub trait Driver { pub struct Registration<T: Driver>(ARef<drm::Device<T>>); impl<T: Driver> Registration<T> { - /// Creates a new [`Registration`] and registers it. fn new(drm: &drm::Device<T>, flags: usize) -> Result<Self> { // SAFETY: `drm.as_raw()` is valid by the invariants of `drm::Device`. to_result(unsafe { bindings::drm_dev_register(drm.as_raw(), flags) })?; @@ -129,8 +128,9 @@ impl<T: Driver> Registration<T> { Ok(Self(drm.into())) } - /// Same as [`Registration::new`}, but transfers ownership of the [`Registration`] to - /// [`devres::register`]. + /// Registers a new [`Device`](drm::Device) with userspace. + /// + /// Ownership of the [`Registration`] object is passed to [`devres::register`]. pub fn new_foreign_owned( drm: &drm::Device<T>, dev: &device::Device<device::Bound>, diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs index a7f682e95c01..d49a9ba02635 100644 --- a/rust/kernel/drm/gem/mod.rs +++ b/rust/kernel/drm/gem/mod.rs @@ -210,7 +210,7 @@ impl<T: DriverObject> Object<T> { // SAFETY: The arguments are all valid per the type invariants. to_result(unsafe { bindings::drm_gem_object_init(dev.as_raw(), obj.obj.get(), size) })?; - // SAFETY: We never move out of `Self`. + // SAFETY: We will never move out of `Self` as `ARef<Self>` is always treated as pinned. let ptr = KBox::into_raw(unsafe { Pin::into_inner_unchecked(obj) }); // SAFETY: `ptr` comes from `KBox::into_raw` and hence can't be NULL. @@ -253,7 +253,7 @@ impl<T: DriverObject> Object<T> { } // SAFETY: Instances of `Object<T>` are always reference-counted. -unsafe impl<T: DriverObject> crate::types::AlwaysRefCounted for Object<T> { +unsafe impl<T: DriverObject> crate::sync::aref::AlwaysRefCounted for Object<T> { fn inc_ref(&self) { // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero. unsafe { bindings::drm_gem_object_get(self.as_raw()) }; @@ -293,9 +293,7 @@ impl<T: DriverObject> AllocImpl for Object<T> { } pub(super) const fn create_fops() -> bindings::file_operations { - // SAFETY: As by the type invariant, it is safe to initialize `bindings::file_operations` - // zeroed. - let mut fops: bindings::file_operations = unsafe { core::mem::zeroed() }; + let mut fops: bindings::file_operations = pin_init::zeroed(); fops.owner = core::ptr::null_mut(); fops.open = Some(bindings::drm_open); diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs index 432fc0297d4a..adecb200c654 100644 --- a/rust/kernel/page.rs +++ b/rust/kernel/page.rs @@ -25,14 +25,36 @@ pub const PAGE_SIZE: usize = bindings::PAGE_SIZE; /// A bitmask that gives the page containing a given address. pub const PAGE_MASK: usize = !(PAGE_SIZE - 1); -/// Round up the given number to the next multiple of [`PAGE_SIZE`]. +/// Rounds up to the next multiple of [`PAGE_SIZE`]. /// -/// It is incorrect to pass an address where the next multiple of [`PAGE_SIZE`] doesn't fit in a -/// [`usize`]. -pub const fn page_align(addr: usize) -> usize { - // Parentheses around `PAGE_SIZE - 1` to avoid triggering overflow sanitizers in the wrong - // cases. - (addr + (PAGE_SIZE - 1)) & PAGE_MASK +/// Returns [`None`] on integer overflow. +/// +/// # Examples +/// +/// ``` +/// use kernel::page::{ +/// page_align, +/// PAGE_SIZE, +/// }; +/// +/// // Requested address is already aligned. +/// assert_eq!(page_align(0x0), Some(0x0)); +/// assert_eq!(page_align(PAGE_SIZE), Some(PAGE_SIZE)); +/// +/// // Requested address needs alignment up. +/// assert_eq!(page_align(0x1), Some(PAGE_SIZE)); +/// assert_eq!(page_align(PAGE_SIZE + 1), Some(2 * PAGE_SIZE)); +/// +/// // Requested address causes overflow (returns `None`). +/// let overflow_addr = usize::MAX - (PAGE_SIZE / 2); +/// assert_eq!(page_align(overflow_addr), None); +/// ``` +#[inline(always)] +pub const fn page_align(addr: usize) -> Option<usize> { + let Some(sum) = addr.checked_add(PAGE_SIZE - 1) else { + return None; + }; + Some(sum & PAGE_MASK) } /// Representation of a non-owning reference to a [`Page`]. |
