summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorDave Airlie <airlied@redhat.com>2026-01-28 13:35:17 +1000
committerDave Airlie <airlied@redhat.com>2026-01-28 13:35:23 +1000
commit15392f76405ecb953216b437bed76ffa49cefb7b (patch)
tree475b2e745389418a67ccc3066e8bf9354fc536c2
parent205bd15619322a1429c1bf53831a284a12b25e2a (diff)
parentcea7b66a80412e2a5b74627b89ae25f1d0110a4b (diff)
Merge tag 'drm-rust-next-2026-01-26' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-next
DRM Rust changes for v7.0-rc1 DRM: - Fix documentation for Registration constructors. - Use pin_init::zeroed() for fops initialization. - Annotate DRM helpers with __rust_helper. - Improve safety documentation for gem::Object::new(). - Update AlwaysRefCounted imports. MM: - Prevent integer overflow in page_align(). Nova (Core): - Prepare for Turing support. This includes parsing and handling Turing-specific firmware headers and sections as well as a Turing Falcon HAL implementation. - Get rid of the Result<impl PinInit<T, E>> anti-pattern. - Relocate initializer-specific code into the appropriate initializer. - Use CStr::from_bytes_until_nul() to remove custom helpers. - Improve handling of unexpected firmware values. - Clean up redundant debug prints. - Replace c_str!() with native Rust C-string literals. - Update nova-core task list. Nova (DRM): - Align GEM object size to system page size. Tyr: - Use generated uAPI bindings for GpuInfo. - Replace manual sleeps with read_poll_timeout(). - Replace c_str!() with native Rust C-string literals. - Suppress warnings for unread fields. - Fix incorrect register name in print statement. Signed-off-by: Dave Airlie <airlied@redhat.com> From: "Danilo Krummrich" <dakr@kernel.org> Link: https://patch.msgid.link/DFYW1WV6DUCG.3K8V2DAVD1Q4A@kernel.org
-rw-r--r--Documentation/gpu/nova/core/todo.rst59
-rw-r--r--drivers/gpu/drm/nova/driver.rs18
-rw-r--r--drivers/gpu/drm/nova/gem.rs6
-rw-r--r--drivers/gpu/drm/tyr/driver.rs55
-rw-r--r--drivers/gpu/drm/tyr/gpu.rs65
-rw-r--r--drivers/gpu/nova-core/driver.rs5
-rw-r--r--drivers/gpu/nova-core/falcon.rs107
-rw-r--r--drivers/gpu/nova-core/falcon/hal.rs26
-rw-r--r--drivers/gpu/nova-core/falcon/hal/ga102.rs43
-rw-r--r--drivers/gpu/nova-core/falcon/hal/tu102.rs77
-rw-r--r--drivers/gpu/nova-core/fb.rs2
-rw-r--r--drivers/gpu/nova-core/firmware.rs203
-rw-r--r--drivers/gpu/nova-core/firmware/booter.rs43
-rw-r--r--drivers/gpu/nova-core/firmware/fwsec.rs51
-rw-r--r--drivers/gpu/nova-core/firmware/gsp.rs146
-rw-r--r--drivers/gpu/nova-core/gpu.rs4
-rw-r--r--drivers/gpu/nova-core/gsp.rs77
-rw-r--r--drivers/gpu/nova-core/gsp/boot.rs18
-rw-r--r--drivers/gpu/nova-core/gsp/cmdq.rs2
-rw-r--r--drivers/gpu/nova-core/gsp/commands.rs29
-rw-r--r--drivers/gpu/nova-core/gsp/fw.rs14
-rw-r--r--drivers/gpu/nova-core/gsp/sequencer.rs14
-rw-r--r--drivers/gpu/nova-core/nova_core.rs1
-rw-r--r--drivers/gpu/nova-core/regs.rs42
-rw-r--r--drivers/gpu/nova-core/util.rs16
-rw-r--r--drivers/gpu/nova-core/vbios.rs73
-rw-r--r--rust/helpers/drm.c7
-rw-r--r--rust/kernel/drm/driver.rs6
-rw-r--r--rust/kernel/drm/gem/mod.rs8
-rw-r--r--rust/kernel/page.rs36
30 files changed, 822 insertions, 431 deletions
diff --git a/Documentation/gpu/nova/core/todo.rst b/Documentation/gpu/nova/core/todo.rst
index 35cc7c31d423..d1964eb645e2 100644
--- a/Documentation/gpu/nova/core/todo.rst
+++ b/Documentation/gpu/nova/core/todo.rst
@@ -41,8 +41,15 @@ trait [1] from the num crate.
Having this generalization also helps with implementing a generic macro that
automatically generates the corresponding mappings between a value and a number.
+FromPrimitive support has been worked on in the past, but hasn't been followed
+since then [1].
+
+There also have been considerations of ToPrimitive [2].
+
| Complexity: Beginner
| Link: https://docs.rs/num/latest/num/trait.FromPrimitive.html
+| Link: https://lore.kernel.org/all/cover.1750689857.git.y.j3ms.n@gmail.com/ [1]
+| Link: https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/Implement.20.60FromPrimitive.60.20trait.20.2B.20derive.20macro.20for.20nova-core/with/541971854 [2]
Generic register abstraction [REGA]
-----------------------------------
@@ -134,21 +141,6 @@ A `num` core kernel module is being designed to provide these operations.
| Complexity: Intermediate
| Contact: Alexandre Courbot
-IRQ abstractions
-----------------
-
-Rust abstractions for IRQ handling.
-
-There is active ongoing work from Daniel Almeida [1] for the "core" abstractions
-to request IRQs.
-
-Besides optional review and testing work, the required ``pci::Device`` code
-around those core abstractions needs to be worked out.
-
-| Complexity: Intermediate
-| Link: https://lore.kernel.org/lkml/20250122163932.46697-1-daniel.almeida@collabora.com/ [1]
-| Contact: Daniel Almeida
-
Page abstraction for foreign pages
----------------------------------
@@ -161,40 +153,16 @@ There is active onging work from Abdiel Janulgue [1] and Lina [2].
| Link: https://lore.kernel.org/linux-mm/20241119112408.779243-1-abdiel.janulgue@gmail.com/ [1]
| Link: https://lore.kernel.org/rust-for-linux/20250202-rust-page-v1-0-e3170d7fe55e@asahilina.net/ [2]
-Scatterlist / sg_table abstractions
------------------------------------
-
-Rust abstractions for scatterlist / sg_table.
-
-There is preceding work from Abdiel Janulgue, which hasn't made it to the
-mailing list yet.
-
-| Complexity: Intermediate
-| Contact: Abdiel Janulgue
-
PCI MISC APIs
-------------
-Extend the existing PCI device / driver abstractions by SR-IOV, config space,
-capability, MSI API abstractions.
-
-| Complexity: Beginner
+Extend the existing PCI device / driver abstractions by SR-IOV, capability, MSI
+API abstractions.
-XArray bindings [XARR]
-----------------------
-
-We need bindings for `xa_alloc`/`xa_alloc_cyclic` in order to generate the
-auxiliary device IDs.
-
-| Complexity: Intermediate
+SR-IOV [1] is work in progress.
-Debugfs abstractions
---------------------
-
-Rust abstraction for debugfs APIs.
-
-| Reference: Export GSP log buffers
-| Complexity: Intermediate
+| Complexity: Beginner
+| Link: https://lore.kernel.org/all/20251119-rust-pci-sriov-v1-0-883a94599a97@redhat.com/ [1]
GPU (general)
=============
@@ -233,7 +201,10 @@ Some possible options:
- maple_tree
- native Rust collections
+There is work in progress for using drm_buddy [1].
+
| Complexity: Advanced
+| Link: https://lore.kernel.org/all/20251219203805.1246586-4-joelagnelf@nvidia.com/ [1]
Instance Memory
---------------
diff --git a/drivers/gpu/drm/nova/driver.rs b/drivers/gpu/drm/nova/driver.rs
index 2246d8e104e0..b1af0a099551 100644
--- a/drivers/gpu/drm/nova/driver.rs
+++ b/drivers/gpu/drm/nova/driver.rs
@@ -1,7 +1,15 @@
// SPDX-License-Identifier: GPL-2.0
use kernel::{
- auxiliary, c_str, device::Core, drm, drm::gem, drm::ioctl, prelude::*, sync::aref::ARef,
+ auxiliary,
+ device::Core,
+ drm::{
+ self,
+ gem,
+ ioctl, //
+ },
+ prelude::*,
+ sync::aref::ARef, //
};
use crate::file::File;
@@ -24,12 +32,12 @@ const INFO: drm::DriverInfo = drm::DriverInfo {
major: 0,
minor: 0,
patchlevel: 0,
- name: c_str!("nova"),
- desc: c_str!("Nvidia Graphics"),
+ name: c"nova",
+ desc: c"Nvidia Graphics",
};
-const NOVA_CORE_MODULE_NAME: &CStr = c_str!("NovaCore");
-const AUXILIARY_NAME: &CStr = c_str!("nova-drm");
+const NOVA_CORE_MODULE_NAME: &CStr = c"NovaCore";
+const AUXILIARY_NAME: &CStr = c"nova-drm";
kernel::auxiliary_device_table!(
AUX_TABLE,
diff --git a/drivers/gpu/drm/nova/gem.rs b/drivers/gpu/drm/nova/gem.rs
index 2760ba4f3450..6ccfa5da5761 100644
--- a/drivers/gpu/drm/nova/gem.rs
+++ b/drivers/gpu/drm/nova/gem.rs
@@ -3,6 +3,7 @@
use kernel::{
drm,
drm::{gem, gem::BaseObject},
+ page,
prelude::*,
sync::aref::ARef,
};
@@ -27,11 +28,10 @@ impl gem::DriverObject for NovaObject {
impl NovaObject {
/// Create a new DRM GEM object.
pub(crate) fn new(dev: &NovaDevice, size: usize) -> Result<ARef<gem::Object<Self>>> {
- let aligned_size = size.next_multiple_of(1 << 12);
-
- if size == 0 || size > aligned_size {
+ if size == 0 {
return Err(EINVAL);
}
+ let aligned_size = page::page_align(size).ok_or(EINVAL)?;
gem::Object::new(dev, aligned_size)
}
diff --git a/drivers/gpu/drm/tyr/driver.rs b/drivers/gpu/drm/tyr/driver.rs
index 0389c558c036..568cb89aaed8 100644
--- a/drivers/gpu/drm/tyr/driver.rs
+++ b/drivers/gpu/drm/tyr/driver.rs
@@ -1,6 +1,5 @@
// SPDX-License-Identifier: GPL-2.0 or MIT
-use kernel::c_str;
use kernel::clk::Clk;
use kernel::clk::OptionalClk;
use kernel::device::Bound;
@@ -9,6 +8,7 @@ use kernel::device::Device;
use kernel::devres::Devres;
use kernel::drm;
use kernel::drm::ioctl;
+use kernel::io::poll;
use kernel::new_mutex;
use kernel::of;
use kernel::platform;
@@ -16,10 +16,10 @@ use kernel::prelude::*;
use kernel::regulator;
use kernel::regulator::Regulator;
use kernel::sizes::SZ_2M;
+use kernel::sync::aref::ARef;
use kernel::sync::Arc;
use kernel::sync::Mutex;
use kernel::time;
-use kernel::types::ARef;
use crate::file::File;
use crate::gem::TyrObject;
@@ -34,7 +34,7 @@ pub(crate) type TyrDevice = drm::Device<TyrDriver>;
#[pin_data(PinnedDrop)]
pub(crate) struct TyrDriver {
- device: ARef<TyrDevice>,
+ _device: ARef<TyrDevice>,
}
#[pin_data(PinnedDrop)]
@@ -68,20 +68,13 @@ unsafe impl Sync for TyrData {}
fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;
- // TODO: We cannot poll, as there is no support in Rust currently, so we
- // sleep. Change this when read_poll_timeout() is implemented in Rust.
- kernel::time::delay::fsleep(time::Delta::from_millis(100));
-
- if regs::GPU_IRQ_RAWSTAT.read(dev, iomem)? & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED == 0 {
- dev_err!(dev, "GPU reset failed with errno\n");
- dev_err!(
- dev,
- "GPU_INT_RAWSTAT is {}\n",
- regs::GPU_IRQ_RAWSTAT.read(dev, iomem)?
- );
-
- return Err(EIO);
- }
+ poll::read_poll_timeout(
+ || regs::GPU_IRQ_RAWSTAT.read(dev, iomem),
+ |status| *status & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED != 0,
+ time::Delta::from_millis(1),
+ time::Delta::from_millis(100),
+ )
+ .inspect_err(|_| dev_err!(dev, "GPU reset failed."))?;
Ok(())
}
@@ -91,8 +84,8 @@ kernel::of_device_table!(
MODULE_OF_TABLE,
<TyrDriver as platform::Driver>::IdInfo,
[
- (of::DeviceId::new(c_str!("rockchip,rk3588-mali")), ()),
- (of::DeviceId::new(c_str!("arm,mali-valhall-csf")), ())
+ (of::DeviceId::new(c"rockchip,rk3588-mali"), ()),
+ (of::DeviceId::new(c"arm,mali-valhall-csf"), ())
]
);
@@ -104,16 +97,16 @@ impl platform::Driver for TyrDriver {
pdev: &platform::Device<Core>,
_info: Option<&Self::IdInfo>,
) -> impl PinInit<Self, Error> {
- let core_clk = Clk::get(pdev.as_ref(), Some(c_str!("core")))?;
- let stacks_clk = OptionalClk::get(pdev.as_ref(), Some(c_str!("stacks")))?;
- let coregroup_clk = OptionalClk::get(pdev.as_ref(), Some(c_str!("coregroup")))?;
+ let core_clk = Clk::get(pdev.as_ref(), Some(c"core"))?;
+ let stacks_clk = OptionalClk::get(pdev.as_ref(), Some(c"stacks"))?;
+ let coregroup_clk = OptionalClk::get(pdev.as_ref(), Some(c"coregroup"))?;
core_clk.prepare_enable()?;
stacks_clk.prepare_enable()?;
coregroup_clk.prepare_enable()?;
- let mali_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c_str!("mali"))?;
- let sram_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c_str!("sram"))?;
+ let mali_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c"mali")?;
+ let sram_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c"sram")?;
let request = pdev.io_request_by_index(0).ok_or(ENODEV)?;
let iomem = Arc::pin_init(request.iomap_sized::<SZ_2M>(), GFP_KERNEL)?;
@@ -134,8 +127,8 @@ impl platform::Driver for TyrDriver {
coregroup: coregroup_clk,
}),
regulators <- new_mutex!(Regulators {
- mali: mali_regulator,
- sram: sram_regulator,
+ _mali: mali_regulator,
+ _sram: sram_regulator,
}),
gpu_info,
});
@@ -143,7 +136,7 @@ impl platform::Driver for TyrDriver {
let tdev: ARef<TyrDevice> = drm::Device::new(pdev.as_ref(), data)?;
drm::driver::Registration::new_foreign_owned(&tdev, pdev.as_ref(), 0)?;
- let driver = TyrDriver { device: tdev };
+ let driver = TyrDriver { _device: tdev };
// We need this to be dev_info!() because dev_dbg!() does not work at
// all in Rust for now, and we need to see whether probe succeeded.
@@ -174,8 +167,8 @@ const INFO: drm::DriverInfo = drm::DriverInfo {
major: 1,
minor: 5,
patchlevel: 0,
- name: c_str!("panthor"),
- desc: c_str!("ARM Mali Tyr DRM driver"),
+ name: c"panthor",
+ desc: c"ARM Mali Tyr DRM driver",
};
#[vtable]
@@ -200,6 +193,6 @@ struct Clocks {
#[pin_data]
struct Regulators {
- mali: Regulator<regulator::Enabled>,
- sram: Regulator<regulator::Enabled>,
+ _mali: Regulator<regulator::Enabled>,
+ _sram: Regulator<regulator::Enabled>,
}
diff --git a/drivers/gpu/drm/tyr/gpu.rs b/drivers/gpu/drm/tyr/gpu.rs
index fb7ef7145402..6395ffcfdc57 100644
--- a/drivers/gpu/drm/tyr/gpu.rs
+++ b/drivers/gpu/drm/tyr/gpu.rs
@@ -1,12 +1,15 @@
// SPDX-License-Identifier: GPL-2.0 or MIT
+use core::ops::Deref;
+use core::ops::DerefMut;
use kernel::bits::genmask_u32;
use kernel::device::Bound;
use kernel::device::Device;
use kernel::devres::Devres;
+use kernel::io::poll;
use kernel::platform;
use kernel::prelude::*;
-use kernel::time;
+use kernel::time::Delta;
use kernel::transmute::AsBytes;
use kernel::uapi;
@@ -19,29 +22,9 @@ use crate::regs;
/// # Invariants
///
/// - The layout of this struct identical to the C `struct drm_panthor_gpu_info`.
-#[repr(C)]
-pub(crate) struct GpuInfo {
- pub(crate) gpu_id: u32,
- pub(crate) gpu_rev: u32,
- pub(crate) csf_id: u32,
- pub(crate) l2_features: u32,
- pub(crate) tiler_features: u32,
- pub(crate) mem_features: u32,
- pub(crate) mmu_features: u32,
- pub(crate) thread_features: u32,
- pub(crate) max_threads: u32,
- pub(crate) thread_max_workgroup_size: u32,
- pub(crate) thread_max_barrier_size: u32,
- pub(crate) coherency_features: u32,
- pub(crate) texture_features: [u32; 4],
- pub(crate) as_present: u32,
- pub(crate) selected_coherency: u32,
- pub(crate) shader_present: u64,
- pub(crate) l2_present: u64,
- pub(crate) tiler_present: u64,
- pub(crate) core_features: u32,
- pub(crate) pad: u32,
-}
+#[repr(transparent)]
+#[derive(Clone, Copy)]
+pub(crate) struct GpuInfo(pub(crate) uapi::drm_panthor_gpu_info);
impl GpuInfo {
pub(crate) fn new(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result<Self> {
@@ -74,7 +57,7 @@ impl GpuInfo {
let l2_present = u64::from(regs::GPU_L2_PRESENT_LO.read(dev, iomem)?);
let l2_present = l2_present | u64::from(regs::GPU_L2_PRESENT_HI.read(dev, iomem)?) << 32;
- Ok(Self {
+ Ok(Self(uapi::drm_panthor_gpu_info {
gpu_id,
gpu_rev,
csf_id,
@@ -96,7 +79,8 @@ impl GpuInfo {
tiler_present,
core_features,
pad: 0,
- })
+ gpu_features: 0,
+ }))
}
pub(crate) fn log(&self, pdev: &platform::Device) {
@@ -155,6 +139,20 @@ impl GpuInfo {
}
}
+impl Deref for GpuInfo {
+ type Target = uapi::drm_panthor_gpu_info;
+
+ fn deref(&self) -> &Self::Target {
+ &self.0
+ }
+}
+
+impl DerefMut for GpuInfo {
+ fn deref_mut(&mut self) -> &mut Self::Target {
+ &mut self.0
+ }
+}
+
// SAFETY: `GpuInfo`'s invariant guarantees that it is the same type that is
// already exposed to userspace by the C driver. This implies that it fulfills
// the requirements for `AsBytes`.
@@ -207,14 +205,13 @@ impl From<u32> for GpuId {
pub(crate) fn l2_power_on(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
regs::L2_PWRON_LO.write(dev, iomem, 1)?;
- // TODO: We cannot poll, as there is no support in Rust currently, so we
- // sleep. Change this when read_poll_timeout() is implemented in Rust.
- kernel::time::delay::fsleep(time::Delta::from_millis(100));
-
- if regs::L2_READY_LO.read(dev, iomem)? != 1 {
- dev_err!(dev, "Failed to power on the GPU\n");
- return Err(EIO);
- }
+ poll::read_poll_timeout(
+ || regs::L2_READY_LO.read(dev, iomem),
+ |status| *status == 1,
+ Delta::from_millis(1),
+ Delta::from_millis(100),
+ )
+ .inspect_err(|_| dev_err!(dev, "Failed to power on the GPU."))?;
Ok(())
}
diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index b8b0cc0f2d93..5a4cc047bcfc 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -2,7 +2,6 @@
use kernel::{
auxiliary,
- c_str,
device::Core,
devres::Devres,
dma::Device,
@@ -82,7 +81,7 @@ impl pci::Driver for NovaCore {
unsafe { pdev.dma_set_mask_and_coherent(DmaMask::new::<GPU_DMA_BITS>())? };
let bar = Arc::pin_init(
- pdev.iomap_region_sized::<BAR0_SIZE>(0, c_str!("nova-core/bar0")),
+ pdev.iomap_region_sized::<BAR0_SIZE>(0, c"nova-core/bar0"),
GFP_KERNEL,
)?;
@@ -90,7 +89,7 @@ impl pci::Driver for NovaCore {
gpu <- Gpu::new(pdev, bar.clone(), bar.access(pdev.as_ref())?),
_reg <- auxiliary::Registration::new(
pdev.as_ref(),
- c_str!("nova-drm"),
+ c"nova-drm",
0, // TODO[XARR]: Once it lands, use XArray; for now we don't use the ID.
crate::MODULE_NAME
),
diff --git a/drivers/gpu/nova-core/falcon.rs b/drivers/gpu/nova-core/falcon.rs
index 82c661aef594..37bfee1d0949 100644
--- a/drivers/gpu/nova-core/falcon.rs
+++ b/drivers/gpu/nova-core/falcon.rs
@@ -8,12 +8,14 @@ use hal::FalconHal;
use kernel::{
device,
- dma::DmaAddress,
+ dma::{
+ DmaAddress,
+ DmaMask, //
+ },
io::poll::read_poll_timeout,
prelude::*,
sync::aref::ARef,
time::{
- delay::fsleep,
Delta, //
},
};
@@ -21,6 +23,7 @@ use kernel::{
use crate::{
dma::DmaObject,
driver::Bar0,
+ falcon::hal::LoadMethod,
gpu::Chipset,
num::{
FromSafeCast,
@@ -237,8 +240,11 @@ impl From<PeregrineCoreSelect> for bool {
/// Different types of memory present in a falcon core.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) enum FalconMem {
- /// Instruction Memory.
- Imem,
+ /// Secure Instruction Memory.
+ ImemSecure,
+ /// Non-Secure Instruction Memory.
+ #[expect(unused)]
+ ImemNonSecure,
/// Data Memory.
Dmem,
}
@@ -345,8 +351,12 @@ pub(crate) struct FalconBromParams {
/// Trait for providing load parameters of falcon firmwares.
pub(crate) trait FalconLoadParams {
- /// Returns the load parameters for `IMEM`.
- fn imem_load_params(&self) -> FalconLoadTarget;
+ /// Returns the load parameters for Secure `IMEM`.
+ fn imem_sec_load_params(&self) -> FalconLoadTarget;
+
+ /// Returns the load parameters for Non-Secure `IMEM`,
+ /// used only on Turing and GA100.
+ fn imem_ns_load_params(&self) -> Option<FalconLoadTarget>;
/// Returns the load parameters for `DMEM`.
fn dmem_load_params(&self) -> FalconLoadTarget;
@@ -388,48 +398,11 @@ impl<E: FalconEngine + 'static> Falcon<E> {
regs::NV_PFALCON_FALCON_DMACTL::default().write(bar, &E::ID);
}
- /// Wait for memory scrubbing to complete.
- fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result {
- // TIMEOUT: memory scrubbing should complete in less than 20ms.
- read_poll_timeout(
- || Ok(regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID)),
- |r| r.mem_scrubbing_done(),
- Delta::ZERO,
- Delta::from_millis(20),
- )
- .map(|_| ())
- }
-
- /// Reset the falcon engine.
- fn reset_eng(&self, bar: &Bar0) -> Result {
- let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID);
-
- // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set
- // RESET_READY so a non-failing timeout is used.
- let _ = read_poll_timeout(
- || Ok(regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID)),
- |r| r.reset_ready(),
- Delta::ZERO,
- Delta::from_micros(150),
- );
-
- regs::NV_PFALCON_FALCON_ENGINE::update(bar, &E::ID, |v| v.set_reset(true));
-
- // TIMEOUT: falcon engine should not take more than 10us to reset.
- fsleep(Delta::from_micros(10));
-
- regs::NV_PFALCON_FALCON_ENGINE::update(bar, &E::ID, |v| v.set_reset(false));
-
- self.reset_wait_mem_scrubbing(bar)?;
-
- Ok(())
- }
-
/// Reset the controller, select the falcon core, and wait for memory scrubbing to complete.
pub(crate) fn reset(&self, bar: &Bar0) -> Result {
- self.reset_eng(bar)?;
+ self.hal.reset_eng(bar)?;
self.hal.select_core(self, bar)?;
- self.reset_wait_mem_scrubbing(bar)?;
+ self.hal.reset_wait_mem_scrubbing(bar)?;
regs::NV_PFALCON_FALCON_RM::default()
.set_value(regs::NV_PMC_BOOT_0::read(bar).into())
@@ -448,7 +421,6 @@ impl<E: FalconEngine + 'static> Falcon<E> {
fw: &F,
target_mem: FalconMem,
load_offsets: FalconLoadTarget,
- sec: bool,
) -> Result {
const DMA_LEN: u32 = 256;
@@ -457,7 +429,9 @@ impl<E: FalconEngine + 'static> Falcon<E> {
//
// For DMEM we can fold the start offset into the DMA handle.
let (src_start, dma_start) = match target_mem {
- FalconMem::Imem => (load_offsets.src_start, fw.dma_handle()),
+ FalconMem::ImemSecure | FalconMem::ImemNonSecure => {
+ (load_offsets.src_start, fw.dma_handle())
+ }
FalconMem::Dmem => (
0,
fw.dma_handle_with_offset(load_offsets.src_start.into_safe_cast())?,
@@ -466,12 +440,18 @@ impl<E: FalconEngine + 'static> Falcon<E> {
if dma_start % DmaAddress::from(DMA_LEN) > 0 {
dev_err!(
self.dev,
- "DMA transfer start addresses must be a multiple of {}",
+ "DMA transfer start addresses must be a multiple of {}\n",
DMA_LEN
);
return Err(EINVAL);
}
+ // The DMATRFBASE/1 register pair only supports a 49-bit address.
+ if dma_start > DmaMask::new::<49>().value() {
+ dev_err!(self.dev, "DMA address {:#x} exceeds 49 bits\n", dma_start);
+ return Err(ERANGE);
+ }
+
// DMA transfers can only be done in units of 256 bytes. Compute how many such transfers we
// need to perform.
let num_transfers = load_offsets.len.div_ceil(DMA_LEN);
@@ -483,11 +463,11 @@ impl<E: FalconEngine + 'static> Falcon<E> {
.and_then(|size| size.checked_add(load_offsets.src_start))
{
None => {
- dev_err!(self.dev, "DMA transfer length overflow");
+ dev_err!(self.dev, "DMA transfer length overflow\n");
return Err(EOVERFLOW);
}
Some(upper_bound) if usize::from_safe_cast(upper_bound) > fw.size() => {
- dev_err!(self.dev, "DMA transfer goes beyond range of DMA object");
+ dev_err!(self.dev, "DMA transfer goes beyond range of DMA object\n");
return Err(EINVAL);
}
Some(_) => (),
@@ -508,8 +488,7 @@ impl<E: FalconEngine + 'static> Falcon<E> {
let cmd = regs::NV_PFALCON_FALCON_DMATRFCMD::default()
.set_size(DmaTrfCmdSize::Size256B)
- .set_imem(target_mem == FalconMem::Imem)
- .set_sec(if sec { 1 } else { 0 });
+ .with_falcon_mem(target_mem);
for pos in (0..num_transfers).map(|i| i * DMA_LEN) {
// Perform a transfer of size `DMA_LEN`.
@@ -536,15 +515,22 @@ impl<E: FalconEngine + 'static> Falcon<E> {
}
/// Perform a DMA load into `IMEM` and `DMEM` of `fw`, and prepare the falcon to run it.
- pub(crate) fn dma_load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result {
+ fn dma_load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result {
+ // The Non-Secure section only exists on firmware used by Turing and GA100, and
+ // those platforms do not use DMA.
+ if fw.imem_ns_load_params().is_some() {
+ debug_assert!(false);
+ return Err(EINVAL);
+ }
+
self.dma_reset(bar);
regs::NV_PFALCON_FBIF_TRANSCFG::update(bar, &E::ID, 0, |v| {
v.set_target(FalconFbifTarget::CoherentSysmem)
.set_mem_type(FalconFbifMemType::Physical)
});
- self.dma_wr(bar, fw, FalconMem::Imem, fw.imem_load_params(), true)?;
- self.dma_wr(bar, fw, FalconMem::Dmem, fw.dmem_load_params(), true)?;
+ self.dma_wr(bar, fw, FalconMem::ImemSecure, fw.imem_sec_load_params())?;
+ self.dma_wr(bar, fw, FalconMem::Dmem, fw.dmem_load_params())?;
self.hal.program_brom(self, bar, &fw.brom_params())?;
@@ -651,8 +637,15 @@ impl<E: FalconEngine + 'static> Falcon<E> {
///
/// Returns `true` if the RISC-V core is active, `false` otherwise.
pub(crate) fn is_riscv_active(&self, bar: &Bar0) -> bool {
- let cpuctl = regs::NV_PRISCV_RISCV_CPUCTL::read(bar, &E::ID);
- cpuctl.active_stat()
+ self.hal.is_riscv_active(bar)
+ }
+
+ // Load a firmware image into Falcon memory
+ pub(crate) fn load<F: FalconFirmware<Target = E>>(&self, bar: &Bar0, fw: &F) -> Result {
+ match self.hal.load_method() {
+ LoadMethod::Dma => self.dma_load(bar, fw),
+ LoadMethod::Pio => Err(ENOTSUPP),
+ }
}
/// Write the application version to the OS register.
diff --git a/drivers/gpu/nova-core/falcon/hal.rs b/drivers/gpu/nova-core/falcon/hal.rs
index 8dc56a28ad65..89babd5f9325 100644
--- a/drivers/gpu/nova-core/falcon/hal.rs
+++ b/drivers/gpu/nova-core/falcon/hal.rs
@@ -13,6 +13,16 @@ use crate::{
};
mod ga102;
+mod tu102;
+
+/// Method used to load data into falcon memory. Some GPU architectures need
+/// PIO and others can use DMA.
+pub(crate) enum LoadMethod {
+ /// Programmed I/O
+ Pio,
+ /// Direct Memory Access
+ Dma,
+}
/// Hardware Abstraction Layer for Falcon cores.
///
@@ -37,6 +47,19 @@ pub(crate) trait FalconHal<E: FalconEngine>: Send + Sync {
/// Program the boot ROM registers prior to starting a secure firmware.
fn program_brom(&self, falcon: &Falcon<E>, bar: &Bar0, params: &FalconBromParams) -> Result;
+
+ /// Check if the RISC-V core is active.
+ /// Returns `true` if the RISC-V core is active, `false` otherwise.
+ fn is_riscv_active(&self, bar: &Bar0) -> bool;
+
+ /// Wait for memory scrubbing to complete.
+ fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result;
+
+ /// Reset the falcon engine.
+ fn reset_eng(&self, bar: &Bar0) -> Result;
+
+ /// returns the method needed to load data into Falcon memory
+ fn load_method(&self) -> LoadMethod;
}
/// Returns a boxed falcon HAL adequate for `chipset`.
@@ -50,6 +73,9 @@ pub(super) fn falcon_hal<E: FalconEngine + 'static>(
use Chipset::*;
let hal = match chipset {
+ TU102 | TU104 | TU106 | TU116 | TU117 => {
+ KBox::new(tu102::Tu102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
+ }
GA102 | GA103 | GA104 | GA106 | GA107 | AD102 | AD103 | AD104 | AD106 | AD107 => {
KBox::new(ga102::Ga102::<E>::new(), GFP_KERNEL)? as KBox<dyn FalconHal<E>>
}
diff --git a/drivers/gpu/nova-core/falcon/hal/ga102.rs b/drivers/gpu/nova-core/falcon/hal/ga102.rs
index 69a7a95cac16..8f62df10da0a 100644
--- a/drivers/gpu/nova-core/falcon/hal/ga102.rs
+++ b/drivers/gpu/nova-core/falcon/hal/ga102.rs
@@ -12,6 +12,7 @@ use kernel::{
use crate::{
driver::Bar0,
falcon::{
+ hal::LoadMethod,
Falcon,
FalconBromParams,
FalconEngine,
@@ -52,7 +53,7 @@ fn signature_reg_fuse_version_ga102(
let ucode_idx = match usize::from(ucode_id) {
ucode_id @ 1..=regs::NV_FUSE_OPT_FPF_SIZE => ucode_id - 1,
_ => {
- dev_err!(dev, "invalid ucode id {:#x}", ucode_id);
+ dev_err!(dev, "invalid ucode id {:#x}\n", ucode_id);
return Err(EINVAL);
}
};
@@ -66,7 +67,7 @@ fn signature_reg_fuse_version_ga102(
} else if engine_id_mask & 0x0400 != 0 {
regs::NV_FUSE_OPT_FPF_GSP_UCODE1_VERSION::read(bar, ucode_idx).data()
} else {
- dev_err!(dev, "unexpected engine_id_mask {:#x}", engine_id_mask);
+ dev_err!(dev, "unexpected engine_id_mask {:#x}\n", engine_id_mask);
return Err(EINVAL);
};
@@ -117,4 +118,42 @@ impl<E: FalconEngine> FalconHal<E> for Ga102<E> {
fn program_brom(&self, _falcon: &Falcon<E>, bar: &Bar0, params: &FalconBromParams) -> Result {
program_brom_ga102::<E>(bar, params)
}
+
+ fn is_riscv_active(&self, bar: &Bar0) -> bool {
+ let cpuctl = regs::NV_PRISCV_RISCV_CPUCTL::read(bar, &E::ID);
+ cpuctl.active_stat()
+ }
+
+ fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result {
+ // TIMEOUT: memory scrubbing should complete in less than 20ms.
+ read_poll_timeout(
+ || Ok(regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID)),
+ |r| r.mem_scrubbing_done(),
+ Delta::ZERO,
+ Delta::from_millis(20),
+ )
+ .map(|_| ())
+ }
+
+ fn reset_eng(&self, bar: &Bar0) -> Result {
+ let _ = regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID);
+
+ // According to OpenRM's `kflcnPreResetWait_GA102` documentation, HW sometimes does not set
+ // RESET_READY so a non-failing timeout is used.
+ let _ = read_poll_timeout(
+ || Ok(regs::NV_PFALCON_FALCON_HWCFG2::read(bar, &E::ID)),
+ |r| r.reset_ready(),
+ Delta::ZERO,
+ Delta::from_micros(150),
+ );
+
+ regs::NV_PFALCON_FALCON_ENGINE::reset_engine::<E>(bar);
+ self.reset_wait_mem_scrubbing(bar)?;
+
+ Ok(())
+ }
+
+ fn load_method(&self) -> LoadMethod {
+ LoadMethod::Dma
+ }
}
diff --git a/drivers/gpu/nova-core/falcon/hal/tu102.rs b/drivers/gpu/nova-core/falcon/hal/tu102.rs
new file mode 100644
index 000000000000..7de6f24cc0a0
--- /dev/null
+++ b/drivers/gpu/nova-core/falcon/hal/tu102.rs
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+
+use core::marker::PhantomData;
+
+use kernel::{
+ io::poll::read_poll_timeout,
+ prelude::*,
+ time::Delta, //
+};
+
+use crate::{
+ driver::Bar0,
+ falcon::{
+ hal::LoadMethod,
+ Falcon,
+ FalconBromParams,
+ FalconEngine, //
+ },
+ regs, //
+};
+
+use super::FalconHal;
+
+pub(super) struct Tu102<E: FalconEngine>(PhantomData<E>);
+
+impl<E: FalconEngine> Tu102<E> {
+ pub(super) fn new() -> Self {
+ Self(PhantomData)
+ }
+}
+
+impl<E: FalconEngine> FalconHal<E> for Tu102<E> {
+ fn select_core(&self, _falcon: &Falcon<E>, _bar: &Bar0) -> Result {
+ Ok(())
+ }
+
+ fn signature_reg_fuse_version(
+ &self,
+ _falcon: &Falcon<E>,
+ _bar: &Bar0,
+ _engine_id_mask: u16,
+ _ucode_id: u8,
+ ) -> Result<u32> {
+ Ok(0)
+ }
+
+ fn program_brom(&self, _falcon: &Falcon<E>, _bar: &Bar0, _params: &FalconBromParams) -> Result {
+ Ok(())
+ }
+
+ fn is_riscv_active(&self, bar: &Bar0) -> bool {
+ let cpuctl = regs::NV_PRISCV_RISCV_CORE_SWITCH_RISCV_STATUS::read(bar, &E::ID);
+ cpuctl.active_stat()
+ }
+
+ fn reset_wait_mem_scrubbing(&self, bar: &Bar0) -> Result {
+ // TIMEOUT: memory scrubbing should complete in less than 10ms.
+ read_poll_timeout(
+ || Ok(regs::NV_PFALCON_FALCON_DMACTL::read(bar, &E::ID)),
+ |r| r.mem_scrubbing_done(),
+ Delta::ZERO,
+ Delta::from_millis(10),
+ )
+ .map(|_| ())
+ }
+
+ fn reset_eng(&self, bar: &Bar0) -> Result {
+ regs::NV_PFALCON_FALCON_ENGINE::reset_engine::<E>(bar);
+ self.reset_wait_mem_scrubbing(bar)?;
+
+ Ok(())
+ }
+
+ fn load_method(&self) -> LoadMethod {
+ LoadMethod::Pio
+ }
+}
diff --git a/drivers/gpu/nova-core/fb.rs b/drivers/gpu/nova-core/fb.rs
index 3c9cf151786c..c62abcaed547 100644
--- a/drivers/gpu/nova-core/fb.rs
+++ b/drivers/gpu/nova-core/fb.rs
@@ -80,7 +80,7 @@ impl SysmemFlush {
let _ = hal.write_sysmem_flush_page(bar, 0).inspect_err(|e| {
dev_warn!(
&self.device,
- "failed to unregister sysmem flush page: {:?}",
+ "failed to unregister sysmem flush page: {:?}\n",
e
)
});
diff --git a/drivers/gpu/nova-core/firmware.rs b/drivers/gpu/nova-core/firmware.rs
index 2d2008b33fb4..68779540aa28 100644
--- a/drivers/gpu/nova-core/firmware.rs
+++ b/drivers/gpu/nova-core/firmware.rs
@@ -4,6 +4,7 @@
//! to be loaded into a given execution unit.
use core::marker::PhantomData;
+use core::ops::Deref;
use kernel::{
device,
@@ -15,7 +16,10 @@ use kernel::{
use crate::{
dma::DmaObject,
- falcon::FalconFirmware,
+ falcon::{
+ FalconFirmware,
+ FalconLoadTarget, //
+ },
gpu,
num::{
FromSafeCast,
@@ -46,6 +50,46 @@ fn request_firmware(
/// Structure used to describe some firmwares, notably FWSEC-FRTS.
#[repr(C)]
#[derive(Debug, Clone)]
+pub(crate) struct FalconUCodeDescV2 {
+ /// Header defined by 'NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*' in OpenRM.
+ hdr: u32,
+ /// Stored size of the ucode after the header, compressed or uncompressed
+ stored_size: u32,
+ /// Uncompressed size of the ucode. If store_size == uncompressed_size, then the ucode
+ /// is not compressed.
+ pub(crate) uncompressed_size: u32,
+ /// Code entry point
+ pub(crate) virtual_entry: u32,
+ /// Offset after the code segment at which the Application Interface Table headers are located.
+ pub(crate) interface_offset: u32,
+ /// Base address at which to load the code segment into 'IMEM'.
+ pub(crate) imem_phys_base: u32,
+ /// Size in bytes of the code to copy into 'IMEM'.
+ pub(crate) imem_load_size: u32,
+ /// Virtual 'IMEM' address (i.e. 'tag') at which the code should start.
+ pub(crate) imem_virt_base: u32,
+ /// Virtual address of secure IMEM segment.
+ pub(crate) imem_sec_base: u32,
+ /// Size of secure IMEM segment.
+ pub(crate) imem_sec_size: u32,
+ /// Offset into stored (uncompressed) image at which DMEM begins.
+ pub(crate) dmem_offset: u32,
+ /// Base address at which to load the data segment into 'DMEM'.
+ pub(crate) dmem_phys_base: u32,
+ /// Size in bytes of the data to copy into 'DMEM'.
+ pub(crate) dmem_load_size: u32,
+ /// "Alternate" Size of data to load into IMEM.
+ pub(crate) alt_imem_load_size: u32,
+ /// "Alternate" Size of data to load into DMEM.
+ pub(crate) alt_dmem_load_size: u32,
+}
+
+// SAFETY: all bit patterns are valid for this type, and it doesn't use interior mutability.
+unsafe impl FromBytes for FalconUCodeDescV2 {}
+
+/// Structure used to describe some firmwares, notably FWSEC-FRTS.
+#[repr(C)]
+#[derive(Debug, Clone)]
pub(crate) struct FalconUCodeDescV3 {
/// Header defined by `NV_BIT_FALCON_UCODE_DESC_HEADER_VDESC*` in OpenRM.
hdr: u32,
@@ -76,13 +120,164 @@ pub(crate) struct FalconUCodeDescV3 {
_reserved: u16,
}
-impl FalconUCodeDescV3 {
+// SAFETY: all bit patterns are valid for this type, and it doesn't use
+// interior mutability.
+unsafe impl FromBytes for FalconUCodeDescV3 {}
+
+/// Enum wrapping the different versions of Falcon microcode descriptors.
+///
+/// This allows handling both V2 and V3 descriptor formats through a
+/// unified type, providing version-agnostic access to firmware metadata
+/// via the [`FalconUCodeDescriptor`] trait.
+#[derive(Debug, Clone)]
+pub(crate) enum FalconUCodeDesc {
+ V2(FalconUCodeDescV2),
+ V3(FalconUCodeDescV3),
+}
+
+impl Deref for FalconUCodeDesc {
+ type Target = dyn FalconUCodeDescriptor;
+
+ fn deref(&self) -> &Self::Target {
+ match self {
+ FalconUCodeDesc::V2(v2) => v2,
+ FalconUCodeDesc::V3(v3) => v3,
+ }
+ }
+}
+
+/// Trait providing a common interface for accessing Falcon microcode descriptor fields.
+///
+/// This trait abstracts over the different descriptor versions ([`FalconUCodeDescV2`] and
+/// [`FalconUCodeDescV3`]), allowing code to work with firmware metadata without needing to
+/// know the specific descriptor version. Fields not present return zero.
+pub(crate) trait FalconUCodeDescriptor {
+ fn hdr(&self) -> u32;
+ fn imem_load_size(&self) -> u32;
+ fn interface_offset(&self) -> u32;
+ fn dmem_load_size(&self) -> u32;
+ fn pkc_data_offset(&self) -> u32;
+ fn engine_id_mask(&self) -> u16;
+ fn ucode_id(&self) -> u8;
+ fn signature_count(&self) -> u8;
+ fn signature_versions(&self) -> u16;
+
/// Returns the size in bytes of the header.
- pub(crate) fn size(&self) -> usize {
+ fn size(&self) -> usize {
+ let hdr = self.hdr();
+
const HDR_SIZE_SHIFT: u32 = 16;
const HDR_SIZE_MASK: u32 = 0xffff0000;
+ ((hdr & HDR_SIZE_MASK) >> HDR_SIZE_SHIFT).into_safe_cast()
+ }
- ((self.hdr & HDR_SIZE_MASK) >> HDR_SIZE_SHIFT).into_safe_cast()
+ fn imem_sec_load_params(&self) -> FalconLoadTarget;
+ fn imem_ns_load_params(&self) -> Option<FalconLoadTarget>;
+ fn dmem_load_params(&self) -> FalconLoadTarget;
+}
+
+impl FalconUCodeDescriptor for FalconUCodeDescV2 {
+ fn hdr(&self) -> u32 {
+ self.hdr
+ }
+ fn imem_load_size(&self) -> u32 {
+ self.imem_load_size
+ }
+ fn interface_offset(&self) -> u32 {
+ self.interface_offset
+ }
+ fn dmem_load_size(&self) -> u32 {
+ self.dmem_load_size
+ }
+ fn pkc_data_offset(&self) -> u32 {
+ 0
+ }
+ fn engine_id_mask(&self) -> u16 {
+ 0
+ }
+ fn ucode_id(&self) -> u8 {
+ 0
+ }
+ fn signature_count(&self) -> u8 {
+ 0
+ }
+ fn signature_versions(&self) -> u16 {
+ 0
+ }
+
+ fn imem_sec_load_params(&self) -> FalconLoadTarget {
+ FalconLoadTarget {
+ src_start: 0,
+ dst_start: self.imem_sec_base,
+ len: self.imem_sec_size,
+ }
+ }
+
+ fn imem_ns_load_params(&self) -> Option<FalconLoadTarget> {
+ Some(FalconLoadTarget {
+ src_start: 0,
+ dst_start: self.imem_phys_base,
+ len: self.imem_load_size.checked_sub(self.imem_sec_size)?,
+ })
+ }
+
+ fn dmem_load_params(&self) -> FalconLoadTarget {
+ FalconLoadTarget {
+ src_start: self.dmem_offset,
+ dst_start: self.dmem_phys_base,
+ len: self.dmem_load_size,
+ }
+ }
+}
+
+impl FalconUCodeDescriptor for FalconUCodeDescV3 {
+ fn hdr(&self) -> u32 {
+ self.hdr
+ }
+ fn imem_load_size(&self) -> u32 {
+ self.imem_load_size
+ }
+ fn interface_offset(&self) -> u32 {
+ self.interface_offset
+ }
+ fn dmem_load_size(&self) -> u32 {
+ self.dmem_load_size
+ }
+ fn pkc_data_offset(&self) -> u32 {
+ self.pkc_data_offset
+ }
+ fn engine_id_mask(&self) -> u16 {
+ self.engine_id_mask
+ }
+ fn ucode_id(&self) -> u8 {
+ self.ucode_id
+ }
+ fn signature_count(&self) -> u8 {
+ self.signature_count
+ }
+ fn signature_versions(&self) -> u16 {
+ self.signature_versions
+ }
+
+ fn imem_sec_load_params(&self) -> FalconLoadTarget {
+ FalconLoadTarget {
+ src_start: 0,
+ dst_start: self.imem_phys_base,
+ len: self.imem_load_size,
+ }
+ }
+
+ fn imem_ns_load_params(&self) -> Option<FalconLoadTarget> {
+ // Not used on V3 platforms
+ None
+ }
+
+ fn dmem_load_params(&self) -> FalconLoadTarget {
+ FalconLoadTarget {
+ src_start: self.imem_load_size,
+ dst_start: self.dmem_phys_base,
+ len: self.dmem_load_size,
+ }
}
}
diff --git a/drivers/gpu/nova-core/firmware/booter.rs b/drivers/gpu/nova-core/firmware/booter.rs
index f107f753214a..86556cee8e67 100644
--- a/drivers/gpu/nova-core/firmware/booter.rs
+++ b/drivers/gpu/nova-core/firmware/booter.rs
@@ -251,8 +251,11 @@ impl<'a> FirmwareSignature<BooterFirmware> for BooterSignature<'a> {}
/// The `Booter` loader firmware, responsible for loading the GSP.
pub(crate) struct BooterFirmware {
- // Load parameters for `IMEM` falcon memory.
- imem_load_target: FalconLoadTarget,
+ // Load parameters for Secure `IMEM` falcon memory.
+ imem_sec_load_target: FalconLoadTarget,
+ // Load parameters for Non-Secure `IMEM` falcon memory,
+ // used only on Turing and GA100
+ imem_ns_load_target: Option<FalconLoadTarget>,
// Load parameters for `DMEM` falcon memory.
dmem_load_target: FalconLoadTarget,
// BROM falcon parameters.
@@ -353,12 +356,30 @@ impl BooterFirmware {
}
};
+ // There are two versions of Booter, one for Turing/GA100, and another for
+ // GA102+. The extraction of the IMEM sections differs between the two
+ // versions. Unfortunately, the file names are the same, and the headers
+ // don't indicate the versions. The only way to differentiate is by the Chipset.
+ let (imem_sec_dst_start, imem_ns_load_target) = if chipset <= Chipset::GA100 {
+ (
+ app0.offset,
+ Some(FalconLoadTarget {
+ src_start: 0,
+ dst_start: load_hdr.os_code_offset,
+ len: load_hdr.os_code_size,
+ }),
+ )
+ } else {
+ (0, None)
+ };
+
Ok(Self {
- imem_load_target: FalconLoadTarget {
+ imem_sec_load_target: FalconLoadTarget {
src_start: app0.offset,
- dst_start: 0,
+ dst_start: imem_sec_dst_start,
len: app0.len,
},
+ imem_ns_load_target,
dmem_load_target: FalconLoadTarget {
src_start: load_hdr.os_data_offset,
dst_start: 0,
@@ -371,8 +392,12 @@ impl BooterFirmware {
}
impl FalconLoadParams for BooterFirmware {
- fn imem_load_params(&self) -> FalconLoadTarget {
- self.imem_load_target.clone()
+ fn imem_sec_load_params(&self) -> FalconLoadTarget {
+ self.imem_sec_load_target.clone()
+ }
+
+ fn imem_ns_load_params(&self) -> Option<FalconLoadTarget> {
+ self.imem_ns_load_target.clone()
}
fn dmem_load_params(&self) -> FalconLoadTarget {
@@ -384,7 +409,11 @@ impl FalconLoadParams for BooterFirmware {
}
fn boot_addr(&self) -> u32 {
- self.imem_load_target.src_start
+ if let Some(ns_target) = &self.imem_ns_load_target {
+ ns_target.dst_start
+ } else {
+ self.imem_sec_load_target.src_start
+ }
}
}
diff --git a/drivers/gpu/nova-core/firmware/fwsec.rs b/drivers/gpu/nova-core/firmware/fwsec.rs
index b28e34d279f4..a8ec08a500ac 100644
--- a/drivers/gpu/nova-core/firmware/fwsec.rs
+++ b/drivers/gpu/nova-core/firmware/fwsec.rs
@@ -40,7 +40,7 @@ use crate::{
FalconLoadTarget, //
},
firmware::{
- FalconUCodeDescV3,
+ FalconUCodeDesc,
FirmwareDmaObject,
FirmwareSignature,
Signed,
@@ -218,33 +218,29 @@ unsafe fn transmute_mut<T: Sized + FromBytes + AsBytes>(
/// It is responsible for e.g. carving out the WPR2 region as the first step of the GSP bootflow.
pub(crate) struct FwsecFirmware {
/// Descriptor of the firmware.
- desc: FalconUCodeDescV3,
+ desc: FalconUCodeDesc,
/// GPU-accessible DMA object containing the firmware.
ucode: FirmwareDmaObject<Self, Signed>,
}
impl FalconLoadParams for FwsecFirmware {
- fn imem_load_params(&self) -> FalconLoadTarget {
- FalconLoadTarget {
- src_start: 0,
- dst_start: self.desc.imem_phys_base,
- len: self.desc.imem_load_size,
- }
+ fn imem_sec_load_params(&self) -> FalconLoadTarget {
+ self.desc.imem_sec_load_params()
+ }
+
+ fn imem_ns_load_params(&self) -> Option<FalconLoadTarget> {
+ self.desc.imem_ns_load_params()
}
fn dmem_load_params(&self) -> FalconLoadTarget {
- FalconLoadTarget {
- src_start: self.desc.imem_load_size,
- dst_start: self.desc.dmem_phys_base,
- len: self.desc.dmem_load_size,
- }
+ self.desc.dmem_load_params()
}
fn brom_params(&self) -> FalconBromParams {
FalconBromParams {
- pkc_data_offset: self.desc.pkc_data_offset,
- engine_id_mask: self.desc.engine_id_mask,
- ucode_id: self.desc.ucode_id,
+ pkc_data_offset: self.desc.pkc_data_offset(),
+ engine_id_mask: self.desc.engine_id_mask(),
+ ucode_id: self.desc.ucode_id(),
}
}
@@ -268,10 +264,10 @@ impl FalconFirmware for FwsecFirmware {
impl FirmwareDmaObject<FwsecFirmware, Unsigned> {
fn new_fwsec(dev: &Device<device::Bound>, bios: &Vbios, cmd: FwsecCommand) -> Result<Self> {
let desc = bios.fwsec_image().header()?;
- let ucode = bios.fwsec_image().ucode(desc)?;
+ let ucode = bios.fwsec_image().ucode(&desc)?;
let mut dma_object = DmaObject::from_data(dev, ucode)?;
- let hdr_offset = usize::from_safe_cast(desc.imem_load_size + desc.interface_offset);
+ let hdr_offset = usize::from_safe_cast(desc.imem_load_size() + desc.interface_offset());
// SAFETY: we have exclusive access to `dma_object`.
let hdr: &FalconAppifHdrV1 = unsafe { transmute(&dma_object, hdr_offset) }?;
@@ -298,7 +294,7 @@ impl FirmwareDmaObject<FwsecFirmware, Unsigned> {
let dmem_mapper: &mut FalconAppifDmemmapperV3 = unsafe {
transmute_mut(
&mut dma_object,
- (desc.imem_load_size + dmem_base).into_safe_cast(),
+ (desc.imem_load_size() + dmem_base).into_safe_cast(),
)
}?;
@@ -312,7 +308,7 @@ impl FirmwareDmaObject<FwsecFirmware, Unsigned> {
let frts_cmd: &mut FrtsCmd = unsafe {
transmute_mut(
&mut dma_object,
- (desc.imem_load_size + cmd_in_buffer_offset).into_safe_cast(),
+ (desc.imem_load_size() + cmd_in_buffer_offset).into_safe_cast(),
)
}?;
@@ -359,11 +355,12 @@ impl FwsecFirmware {
// Patch signature if needed.
let desc = bios.fwsec_image().header()?;
- let ucode_signed = if desc.signature_count != 0 {
- let sig_base_img = usize::from_safe_cast(desc.imem_load_size + desc.pkc_data_offset);
- let desc_sig_versions = u32::from(desc.signature_versions);
+ let ucode_signed = if desc.signature_count() != 0 {
+ let sig_base_img =
+ usize::from_safe_cast(desc.imem_load_size() + desc.pkc_data_offset());
+ let desc_sig_versions = u32::from(desc.signature_versions());
let reg_fuse_version =
- falcon.signature_reg_fuse_version(bar, desc.engine_id_mask, desc.ucode_id)?;
+ falcon.signature_reg_fuse_version(bar, desc.engine_id_mask(), desc.ucode_id())?;
dev_dbg!(
dev,
"desc_sig_versions: {:#x}, reg_fuse_version: {}\n",
@@ -397,7 +394,7 @@ impl FwsecFirmware {
dev_dbg!(dev, "patching signature with index {}\n", signature_idx);
let signature = bios
.fwsec_image()
- .sigs(desc)
+ .sigs(&desc)
.and_then(|sigs| sigs.get(signature_idx).ok_or(EINVAL))?;
ucode_dma.patch_signature(signature, sig_base_img)?
@@ -406,7 +403,7 @@ impl FwsecFirmware {
};
Ok(FwsecFirmware {
- desc: desc.clone(),
+ desc,
ucode: ucode_signed,
})
}
@@ -423,7 +420,7 @@ impl FwsecFirmware {
.reset(bar)
.inspect_err(|e| dev_err!(dev, "Failed to reset GSP falcon: {:?}\n", e))?;
falcon
- .dma_load(bar, self)
+ .load(bar, self)
.inspect_err(|e| dev_err!(dev, "Failed to load FWSEC firmware: {:?}\n", e))?;
let (mbox0, _) = falcon
.boot(bar, Some(0), None)
diff --git a/drivers/gpu/nova-core/firmware/gsp.rs b/drivers/gpu/nova-core/firmware/gsp.rs
index 0549805282ab..beabae9a1189 100644
--- a/drivers/gpu/nova-core/firmware/gsp.rs
+++ b/drivers/gpu/nova-core/firmware/gsp.rs
@@ -93,10 +93,7 @@ mod elf {
// Get the start of the name.
elf.get(name_idx..)
- // Stop at the first `0`.
- .and_then(|nstr| nstr.get(0..=nstr.iter().position(|b| *b == 0)?))
- // Convert into CStr. This should never fail because of the line above.
- .and_then(|nstr| CStr::from_bytes_with_nul(nstr).ok())
+ .and_then(|nstr| CStr::from_bytes_until_nul(nstr).ok())
// Convert into str.
.and_then(|c_str| c_str.to_str().ok())
// Check that the name matches.
@@ -153,82 +150,93 @@ pub(crate) struct GspFirmware {
impl GspFirmware {
/// Loads the GSP firmware binaries, map them into `dev`'s address-space, and creates the page
/// tables expected by the GSP bootloader to load it.
- pub(crate) fn new<'a, 'b>(
+ pub(crate) fn new<'a>(
dev: &'a device::Device<device::Bound>,
chipset: Chipset,
- ver: &'b str,
- ) -> Result<impl PinInit<Self, Error> + 'a> {
- let fw = super::request_firmware(dev, chipset, "gsp", ver)?;
+ ver: &'a str,
+ ) -> impl PinInit<Self, Error> + 'a {
+ pin_init::pin_init_scope(move || {
+ let firmware = super::request_firmware(dev, chipset, "gsp", ver)?;
- let fw_section = elf::elf64_section(fw.data(), ".fwimage").ok_or(EINVAL)?;
+ let fw_section = elf::elf64_section(firmware.data(), ".fwimage").ok_or(EINVAL)?;
- let sigs_section = match chipset.arch() {
- Architecture::Ampere => ".fwsignature_ga10x",
- Architecture::Ada => ".fwsignature_ad10x",
- _ => return Err(ENOTSUPP),
- };
- let signatures = elf::elf64_section(fw.data(), sigs_section)
- .ok_or(EINVAL)
- .and_then(|data| DmaObject::from_data(dev, data))?;
+ let size = fw_section.len();
- let size = fw_section.len();
+ // Move the firmware into a vmalloc'd vector and map it into the device address
+ // space.
+ let fw_vvec = VVec::with_capacity(fw_section.len(), GFP_KERNEL)
+ .and_then(|mut v| {
+ v.extend_from_slice(fw_section, GFP_KERNEL)?;
+ Ok(v)
+ })
+ .map_err(|_| ENOMEM)?;
- // Move the firmware into a vmalloc'd vector and map it into the device address
- // space.
- let fw_vvec = VVec::with_capacity(fw_section.len(), GFP_KERNEL)
- .and_then(|mut v| {
- v.extend_from_slice(fw_section, GFP_KERNEL)?;
- Ok(v)
- })
- .map_err(|_| ENOMEM)?;
+ Ok(try_pin_init!(Self {
+ fw <- SGTable::new(dev, fw_vvec, DataDirection::ToDevice, GFP_KERNEL),
+ level2 <- {
+ // Allocate the level 2 page table, map the firmware onto it, and map it into
+ // the device address space.
+ VVec::<u8>::with_capacity(
+ fw.iter().count() * core::mem::size_of::<u64>(),
+ GFP_KERNEL,
+ )
+ .map_err(|_| ENOMEM)
+ .and_then(|level2| map_into_lvl(&fw, level2))
+ .map(|level2| SGTable::new(dev, level2, DataDirection::ToDevice, GFP_KERNEL))?
+ },
+ level1 <- {
+ // Allocate the level 1 page table, map the level 2 page table onto it, and map
+ // it into the device address space.
+ VVec::<u8>::with_capacity(
+ level2.iter().count() * core::mem::size_of::<u64>(),
+ GFP_KERNEL,
+ )
+ .map_err(|_| ENOMEM)
+ .and_then(|level1| map_into_lvl(&level2, level1))
+ .map(|level1| SGTable::new(dev, level1, DataDirection::ToDevice, GFP_KERNEL))?
+ },
+ level0: {
+ // Allocate the level 0 page table as a device-visible DMA object, and map the
+ // level 1 page table onto it.
- let bl = super::request_firmware(dev, chipset, "bootloader", ver)?;
- let bootloader = RiscvFirmware::new(dev, &bl)?;
+ // Level 0 page table data.
+ let mut level0_data = kvec![0u8; GSP_PAGE_SIZE]?;
- Ok(try_pin_init!(Self {
- fw <- SGTable::new(dev, fw_vvec, DataDirection::ToDevice, GFP_KERNEL),
- level2 <- {
- // Allocate the level 2 page table, map the firmware onto it, and map it into the
- // device address space.
- VVec::<u8>::with_capacity(
- fw.iter().count() * core::mem::size_of::<u64>(),
- GFP_KERNEL,
- )
- .map_err(|_| ENOMEM)
- .and_then(|level2| map_into_lvl(&fw, level2))
- .map(|level2| SGTable::new(dev, level2, DataDirection::ToDevice, GFP_KERNEL))?
- },
- level1 <- {
- // Allocate the level 1 page table, map the level 2 page table onto it, and map it
- // into the device address space.
- VVec::<u8>::with_capacity(
- level2.iter().count() * core::mem::size_of::<u64>(),
- GFP_KERNEL,
- )
- .map_err(|_| ENOMEM)
- .and_then(|level1| map_into_lvl(&level2, level1))
- .map(|level1| SGTable::new(dev, level1, DataDirection::ToDevice, GFP_KERNEL))?
- },
- level0: {
- // Allocate the level 0 page table as a device-visible DMA object, and map the
- // level 1 page table onto it.
+ // Fill level 1 page entry.
+ let level1_entry = level1.iter().next().ok_or(EINVAL)?;
+ let level1_entry_addr = level1_entry.dma_address();
+ let dst = &mut level0_data[..size_of_val(&level1_entry_addr)];
+ dst.copy_from_slice(&level1_entry_addr.to_le_bytes());
- // Level 0 page table data.
- let mut level0_data = kvec![0u8; GSP_PAGE_SIZE]?;
+ // Turn the level0 page table into a [`DmaObject`].
+ DmaObject::from_data(dev, &level0_data)?
+ },
+ size,
+ signatures: {
+ let sigs_section = match chipset.arch() {
+ Architecture::Turing
+ if matches!(chipset, Chipset::TU116 | Chipset::TU117) =>
+ {
+ ".fwsignature_tu11x"
+ }
+ Architecture::Turing => ".fwsignature_tu10x",
+ // GA100 uses the same firmware as Turing
+ Architecture::Ampere if chipset == Chipset::GA100 => ".fwsignature_tu10x",
+ Architecture::Ampere => ".fwsignature_ga10x",
+ Architecture::Ada => ".fwsignature_ad10x",
+ };
- // Fill level 1 page entry.
- let level1_entry = level1.iter().next().ok_or(EINVAL)?;
- let level1_entry_addr = level1_entry.dma_address();
- let dst = &mut level0_data[..size_of_val(&level1_entry_addr)];
- dst.copy_from_slice(&level1_entry_addr.to_le_bytes());
+ elf::elf64_section(firmware.data(), sigs_section)
+ .ok_or(EINVAL)
+ .and_then(|data| DmaObject::from_data(dev, data))?
+ },
+ bootloader: {
+ let bl = super::request_firmware(dev, chipset, "bootloader", ver)?;
- // Turn the level0 page table into a [`DmaObject`].
- DmaObject::from_data(dev, &level0_data)?
- },
- size,
- signatures,
- bootloader,
- }))
+ RiscvFirmware::new(dev, &bl)?
+ },
+ }))
+ })
}
/// Returns the DMA handle of the radix3 level 0 page table.
diff --git a/drivers/gpu/nova-core/gpu.rs b/drivers/gpu/nova-core/gpu.rs
index 629c9d2dc994..9b042ef1a308 100644
--- a/drivers/gpu/nova-core/gpu.rs
+++ b/drivers/gpu/nova-core/gpu.rs
@@ -268,7 +268,7 @@ impl Gpu {
// We must wait for GFW_BOOT completion before doing any significant setup on the GPU.
_: {
gfw::wait_gfw_boot_completion(bar)
- .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete"))?;
+ .inspect_err(|_| dev_err!(pdev.as_ref(), "GFW boot did not complete\n"))?;
},
sysmem_flush: SysmemFlush::register(pdev.as_ref(), bar, spec.chipset)?,
@@ -281,7 +281,7 @@ impl Gpu {
sec2_falcon: Falcon::new(pdev.as_ref(), spec.chipset)?,
- gsp <- Gsp::new(pdev)?,
+ gsp <- Gsp::new(pdev),
_: { gsp.boot(pdev, bar, spec.chipset, gsp_falcon, sec2_falcon)? },
diff --git a/drivers/gpu/nova-core/gsp.rs b/drivers/gpu/nova-core/gsp.rs
index fb6f74797178..174feaca0a6b 100644
--- a/drivers/gpu/nova-core/gsp.rs
+++ b/drivers/gpu/nova-core/gsp.rs
@@ -27,7 +27,7 @@ pub(crate) use fw::{
use crate::{
gsp::cmdq::Cmdq,
gsp::fw::{
- GspArgumentsCached,
+ GspArgumentsPadded,
LibosMemoryRegionInitArgument, //
},
num,
@@ -114,48 +114,45 @@ pub(crate) struct Gsp {
/// Command queue.
pub(crate) cmdq: Cmdq,
/// RM arguments.
- rmargs: CoherentAllocation<GspArgumentsCached>,
+ rmargs: CoherentAllocation<GspArgumentsPadded>,
}
impl Gsp {
// Creates an in-place initializer for a `Gsp` manager for `pdev`.
- pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> Result<impl PinInit<Self, Error>> {
- let dev = pdev.as_ref();
- let libos = CoherentAllocation::<LibosMemoryRegionInitArgument>::alloc_coherent(
- dev,
- GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
- GFP_KERNEL | __GFP_ZERO,
- )?;
-
- // Initialise the logging structures. The OpenRM equivalents are in:
- // _kgspInitLibosLoggingStructures (allocates memory for buffers)
- // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
- let loginit = LogBuffer::new(dev)?;
- dma_write!(libos[0] = LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0))?;
-
- let logintr = LogBuffer::new(dev)?;
- dma_write!(libos[1] = LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0))?;
-
- let logrm = LogBuffer::new(dev)?;
- dma_write!(libos[2] = LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0))?;
-
- let cmdq = Cmdq::new(dev)?;
-
- let rmargs = CoherentAllocation::<GspArgumentsCached>::alloc_coherent(
- dev,
- 1,
- GFP_KERNEL | __GFP_ZERO,
- )?;
- dma_write!(rmargs[0] = fw::GspArgumentsCached::new(&cmdq))?;
- dma_write!(libos[3] = LibosMemoryRegionInitArgument::new("RMARGS", &rmargs))?;
-
- Ok(try_pin_init!(Self {
- libos,
- loginit,
- logintr,
- logrm,
- rmargs,
- cmdq,
- }))
+ pub(crate) fn new(pdev: &pci::Device<device::Bound>) -> impl PinInit<Self, Error> + '_ {
+ pin_init::pin_init_scope(move || {
+ let dev = pdev.as_ref();
+
+ Ok(try_pin_init!(Self {
+ libos: CoherentAllocation::<LibosMemoryRegionInitArgument>::alloc_coherent(
+ dev,
+ GSP_PAGE_SIZE / size_of::<LibosMemoryRegionInitArgument>(),
+ GFP_KERNEL | __GFP_ZERO,
+ )?,
+ loginit: LogBuffer::new(dev)?,
+ logintr: LogBuffer::new(dev)?,
+ logrm: LogBuffer::new(dev)?,
+ cmdq: Cmdq::new(dev)?,
+ rmargs: CoherentAllocation::<GspArgumentsPadded>::alloc_coherent(
+ dev,
+ 1,
+ GFP_KERNEL | __GFP_ZERO,
+ )?,
+ _: {
+ // Initialise the logging structures. The OpenRM equivalents are in:
+ // _kgspInitLibosLoggingStructures (allocates memory for buffers)
+ // kgspSetupLibosInitArgs_IMPL (creates pLibosInitArgs[] array)
+ dma_write!(
+ libos[0] = LibosMemoryRegionInitArgument::new("LOGINIT", &loginit.0)
+ )?;
+ dma_write!(
+ libos[1] = LibosMemoryRegionInitArgument::new("LOGINTR", &logintr.0)
+ )?;
+ dma_write!(libos[2] = LibosMemoryRegionInitArgument::new("LOGRM", &logrm.0))?;
+ dma_write!(rmargs[0].inner = fw::GspArgumentsCached::new(cmdq))?;
+ dma_write!(libos[3] = LibosMemoryRegionInitArgument::new("RMARGS", rmargs))?;
+ },
+ }))
+ })
}
}
diff --git a/drivers/gpu/nova-core/gsp/boot.rs b/drivers/gpu/nova-core/gsp/boot.rs
index 54937606b5b0..be427fe26a58 100644
--- a/drivers/gpu/nova-core/gsp/boot.rs
+++ b/drivers/gpu/nova-core/gsp/boot.rs
@@ -82,7 +82,7 @@ impl super::Gsp {
if frts_status != 0 {
dev_err!(
dev,
- "FWSEC-FRTS returned with error code {:#x}",
+ "FWSEC-FRTS returned with error code {:#x}\n",
frts_status
);
@@ -139,10 +139,7 @@ impl super::Gsp {
let bios = Vbios::new(dev, bar)?;
- let gsp_fw = KBox::pin_init(
- GspFirmware::new(dev, chipset, FIRMWARE_VERSION)?,
- GFP_KERNEL,
- )?;
+ let gsp_fw = KBox::pin_init(GspFirmware::new(dev, chipset, FIRMWARE_VERSION), GFP_KERNEL)?;
let fb_layout = FbLayout::new(chipset, bar, &gsp_fw)?;
dev_dbg!(dev, "{:#x?}\n", fb_layout);
@@ -186,7 +183,7 @@ impl super::Gsp {
);
sec2_falcon.reset(bar)?;
- sec2_falcon.dma_load(bar, &booter_loader)?;
+ sec2_falcon.load(bar, &booter_loader)?;
let wpr_handle = wpr_meta.dma_handle();
let (mbox0, mbox1) = sec2_falcon.boot(
bar,
@@ -241,11 +238,10 @@ impl super::Gsp {
// Obtain and display basic GPU information.
let info = commands::get_gsp_info(&mut self.cmdq, bar)?;
- dev_info!(
- pdev.as_ref(),
- "GPU name: {}\n",
- info.gpu_name().unwrap_or("invalid GPU name")
- );
+ match info.gpu_name() {
+ Ok(name) => dev_info!(pdev.as_ref(), "GPU name: {}\n", name),
+ Err(e) => dev_warn!(pdev.as_ref(), "GPU name unavailable: {:?}\n", e),
+ }
Ok(())
}
diff --git a/drivers/gpu/nova-core/gsp/cmdq.rs b/drivers/gpu/nova-core/gsp/cmdq.rs
index 3991ccc0c10f..46819a82a51a 100644
--- a/drivers/gpu/nova-core/gsp/cmdq.rs
+++ b/drivers/gpu/nova-core/gsp/cmdq.rs
@@ -617,7 +617,7 @@ impl Cmdq {
{
dev_err!(
self.dev,
- "GSP RPC: receive: Call {} - bad checksum",
+ "GSP RPC: receive: Call {} - bad checksum\n",
header.sequence()
);
return Err(EIO);
diff --git a/drivers/gpu/nova-core/gsp/commands.rs b/drivers/gpu/nova-core/gsp/commands.rs
index 0425c65b5d6f..c8430a076269 100644
--- a/drivers/gpu/nova-core/gsp/commands.rs
+++ b/drivers/gpu/nova-core/gsp/commands.rs
@@ -2,7 +2,9 @@
use core::{
array,
- convert::Infallible, //
+ convert::Infallible,
+ ffi::FromBytesUntilNulError,
+ str::Utf8Error, //
};
use kernel::{
@@ -30,7 +32,6 @@ use crate::{
},
},
sbuffer::SBufferIter,
- util,
};
/// The `GspSetSystemInfo` command.
@@ -205,11 +206,27 @@ impl MessageFromGsp for GetGspStaticInfoReply {
}
}
+/// Error type for [`GetGspStaticInfoReply::gpu_name`].
+#[derive(Debug)]
+pub(crate) enum GpuNameError {
+ /// The GPU name string does not contain a null terminator.
+ NoNullTerminator(FromBytesUntilNulError),
+
+ /// The GPU name string contains invalid UTF-8.
+ #[expect(dead_code)]
+ InvalidUtf8(Utf8Error),
+}
+
impl GetGspStaticInfoReply {
- /// Returns the name of the GPU as a string, or `None` if the string given by the GSP was
- /// invalid.
- pub(crate) fn gpu_name(&self) -> Option<&str> {
- util::str_from_null_terminated(&self.gpu_name)
+ /// Returns the name of the GPU as a string.
+ ///
+ /// Returns an error if the string given by the GSP does not contain a null terminator or
+ /// contains invalid UTF-8.
+ pub(crate) fn gpu_name(&self) -> core::result::Result<&str, GpuNameError> {
+ CStr::from_bytes_until_nul(&self.gpu_name)
+ .map_err(GpuNameError::NoNullTerminator)?
+ .to_str()
+ .map_err(GpuNameError::InvalidUtf8)
}
}
diff --git a/drivers/gpu/nova-core/gsp/fw.rs b/drivers/gpu/nova-core/gsp/fw.rs
index caeb0d251fe5..83ff91614e36 100644
--- a/drivers/gpu/nova-core/gsp/fw.rs
+++ b/drivers/gpu/nova-core/gsp/fw.rs
@@ -904,9 +904,21 @@ impl GspArgumentsCached {
// SAFETY: Padding is explicit and will not contain uninitialized data.
unsafe impl AsBytes for GspArgumentsCached {}
+/// On Turing and GA100, the entries in the `LibosMemoryRegionInitArgument`
+/// must all be a multiple of GSP_PAGE_SIZE in size, so add padding to force it
+/// to that size.
+#[repr(C)]
+pub(crate) struct GspArgumentsPadded {
+ pub(crate) inner: GspArgumentsCached,
+ _padding: [u8; GSP_PAGE_SIZE - core::mem::size_of::<bindings::GSP_ARGUMENTS_CACHED>()],
+}
+
+// SAFETY: Padding is explicit and will not contain uninitialized data.
+unsafe impl AsBytes for GspArgumentsPadded {}
+
// SAFETY: This struct only contains integer types for which all bit patterns
// are valid.
-unsafe impl FromBytes for GspArgumentsCached {}
+unsafe impl FromBytes for GspArgumentsPadded {}
/// Init arguments for the message queue.
#[repr(transparent)]
diff --git a/drivers/gpu/nova-core/gsp/sequencer.rs b/drivers/gpu/nova-core/gsp/sequencer.rs
index 2d0369c49092..d6c489c39092 100644
--- a/drivers/gpu/nova-core/gsp/sequencer.rs
+++ b/drivers/gpu/nova-core/gsp/sequencer.rs
@@ -14,12 +14,12 @@ use kernel::{
device,
io::poll::read_poll_timeout,
prelude::*,
+ sync::aref::ARef,
time::{
delay::fsleep,
Delta, //
},
- transmute::FromBytes,
- types::ARef, //
+ transmute::FromBytes, //
};
use crate::{
@@ -121,7 +121,7 @@ impl GspSeqCmd {
};
if data.len() < size {
- dev_err!(dev, "Data is not enough for command");
+ dev_err!(dev, "Data is not enough for command\n");
return Err(EINVAL);
}
@@ -320,7 +320,7 @@ impl<'a> Iterator for GspSeqIter<'a> {
cmd_result.map_or_else(
|_err| {
- dev_err!(self.dev, "Error parsing command at offset {}", offset);
+ dev_err!(self.dev, "Error parsing command at offset {}\n", offset);
None
},
|(cmd, size)| {
@@ -382,7 +382,7 @@ impl<'a> GspSequencer<'a> {
dev: params.dev,
};
- dev_dbg!(sequencer.dev, "Running CPU Sequencer commands");
+ dev_dbg!(sequencer.dev, "Running CPU Sequencer commands\n");
for cmd_result in sequencer.iter() {
match cmd_result {
@@ -390,7 +390,7 @@ impl<'a> GspSequencer<'a> {
Err(e) => {
dev_err!(
sequencer.dev,
- "Error running command at index {}",
+ "Error running command at index {}\n",
sequencer.seq_info.cmd_index
);
return Err(e);
@@ -400,7 +400,7 @@ impl<'a> GspSequencer<'a> {
dev_dbg!(
sequencer.dev,
- "CPU Sequencer commands completed successfully"
+ "CPU Sequencer commands completed successfully\n"
);
Ok(())
}
diff --git a/drivers/gpu/nova-core/nova_core.rs b/drivers/gpu/nova-core/nova_core.rs
index b98a1c03f13d..c1121e7c64c5 100644
--- a/drivers/gpu/nova-core/nova_core.rs
+++ b/drivers/gpu/nova-core/nova_core.rs
@@ -16,7 +16,6 @@ mod gsp;
mod num;
mod regs;
mod sbuffer;
-mod util;
mod vbios;
pub(crate) const MODULE_NAME: &kernel::str::CStr = <LocalModule as kernel::ModuleMetadata>::NAME;
diff --git a/drivers/gpu/nova-core/regs.rs b/drivers/gpu/nova-core/regs.rs
index 82cc6c0790e5..ea0d32f5396c 100644
--- a/drivers/gpu/nova-core/regs.rs
+++ b/drivers/gpu/nova-core/regs.rs
@@ -7,15 +7,21 @@
#[macro_use]
pub(crate) mod macros;
-use kernel::prelude::*;
+use kernel::{
+ prelude::*,
+ time, //
+};
use crate::{
+ driver::Bar0,
falcon::{
DmaTrfCmdSize,
FalconCoreRev,
FalconCoreRevSubversion,
+ FalconEngine,
FalconFbifMemType,
FalconFbifTarget,
+ FalconMem,
FalconModSelAlgo,
FalconSecurityModel,
PFalcon2Base,
@@ -306,6 +312,13 @@ register!(NV_PFALCON_FALCON_DMACTL @ PFalconBase[0x0000010c] {
7:7 secure_stat as bool;
});
+impl NV_PFALCON_FALCON_DMACTL {
+ /// Returns `true` if memory scrubbing is completed.
+ pub(crate) fn mem_scrubbing_done(self) -> bool {
+ !self.dmem_scrubbing() && !self.imem_scrubbing()
+ }
+}
+
register!(NV_PFALCON_FALCON_DMATRFBASE @ PFalconBase[0x00000110] {
31:0 base as u32;
});
@@ -325,6 +338,14 @@ register!(NV_PFALCON_FALCON_DMATRFCMD @ PFalconBase[0x00000118] {
16:16 set_dmtag as u8;
});
+impl NV_PFALCON_FALCON_DMATRFCMD {
+ /// Programs the `imem` and `sec` fields for the given FalconMem
+ pub(crate) fn with_falcon_mem(self, mem: FalconMem) -> Self {
+ self.set_imem(mem != FalconMem::Dmem)
+ .set_sec(if mem == FalconMem::ImemSecure { 1 } else { 0 })
+ }
+}
+
register!(NV_PFALCON_FALCON_DMATRFFBOFFS @ PFalconBase[0x0000011c] {
31:0 offs as u32;
});
@@ -349,6 +370,18 @@ register!(NV_PFALCON_FALCON_ENGINE @ PFalconBase[0x000003c0] {
0:0 reset as bool;
});
+impl NV_PFALCON_FALCON_ENGINE {
+ /// Resets the falcon
+ pub(crate) fn reset_engine<E: FalconEngine>(bar: &Bar0) {
+ Self::read(bar, &E::ID).set_reset(true).write(bar, &E::ID);
+
+ // TIMEOUT: falcon engine should not take more than 10us to reset.
+ time::delay::fsleep(time::Delta::from_micros(10));
+
+ Self::read(bar, &E::ID).set_reset(false).write(bar, &E::ID);
+ }
+}
+
register!(NV_PFALCON_FBIF_TRANSCFG @ PFalconBase[0x00000600[8]] {
1:0 target as u8 ?=> FalconFbifTarget;
2:2 mem_type as bool => FalconFbifMemType;
@@ -380,6 +413,13 @@ register!(NV_PFALCON2_FALCON_BROM_PARAADDR @ PFalcon2Base[0x00000210[1]] {
// PRISCV
+// RISC-V status register for debug (Turing and GA100 only).
+// Reflects current RISC-V core status.
+register!(NV_PRISCV_RISCV_CORE_SWITCH_RISCV_STATUS @ PFalcon2Base[0x00000240] {
+ 0:0 active_stat as bool, "RISC-V core active/inactive status";
+});
+
+// GA102 and later
register!(NV_PRISCV_RISCV_CPUCTL @ PFalcon2Base[0x00000388] {
0:0 halted as bool;
7:7 active_stat as bool;
diff --git a/drivers/gpu/nova-core/util.rs b/drivers/gpu/nova-core/util.rs
deleted file mode 100644
index 4b503249a3ef..000000000000
--- a/drivers/gpu/nova-core/util.rs
+++ /dev/null
@@ -1,16 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-
-/// Converts a null-terminated byte slice to a string, or `None` if the array does not
-/// contains any null byte or contains invalid characters.
-///
-/// Contrary to [`kernel::str::CStr::from_bytes_with_nul`], the null byte can be anywhere in the
-/// slice, and not only in the last position.
-pub(crate) fn str_from_null_terminated(bytes: &[u8]) -> Option<&str> {
- use kernel::str::CStr;
-
- bytes
- .iter()
- .position(|&b| b == 0)
- .and_then(|null_pos| CStr::from_bytes_with_nul(&bytes[..=null_pos]).ok())
- .and_then(|cstr| cstr.to_str().ok())
-}
diff --git a/drivers/gpu/nova-core/vbios.rs b/drivers/gpu/nova-core/vbios.rs
index abf423560ff4..72cba8659a2d 100644
--- a/drivers/gpu/nova-core/vbios.rs
+++ b/drivers/gpu/nova-core/vbios.rs
@@ -11,14 +11,16 @@ use kernel::{
Alignable,
Alignment, //
},
+ sync::aref::ARef,
transmute::FromBytes,
- types::ARef,
};
use crate::{
driver::Bar0,
firmware::{
fwsec::Bcrt30Rsa3kSignature,
+ FalconUCodeDesc,
+ FalconUCodeDescV2,
FalconUCodeDescV3, //
},
num::FromSafeCast,
@@ -790,7 +792,7 @@ impl PciAtBiosImage {
// read the 4 bytes at the offset specified in the token
let offset = usize::from(token.data_offset);
let bytes: [u8; 4] = self.base.data[offset..offset + 4].try_into().map_err(|_| {
- dev_err!(self.base.dev, "Failed to convert data slice to array");
+ dev_err!(self.base.dev, "Failed to convert data slice to array\n");
EINVAL
})?;
@@ -887,11 +889,6 @@ impl PmuLookupTable {
ret
};
- // Debug logging of entries (dumps the table data to dmesg)
- for i in (header_len..required_bytes).step_by(entry_len) {
- dev_dbg!(dev, "PMU entry: {:02x?}\n", &data[i..][..entry_len]);
- }
-
Ok(PmuLookupTable { header, table_data })
}
@@ -1003,20 +1000,11 @@ impl FwSecBiosBuilder {
}
impl FwSecBiosImage {
- /// Get the FwSec header ([`FalconUCodeDescV3`]).
- pub(crate) fn header(&self) -> Result<&FalconUCodeDescV3> {
+ /// Get the FwSec header ([`FalconUCodeDesc`]).
+ pub(crate) fn header(&self) -> Result<FalconUCodeDesc> {
// Get the falcon ucode offset that was found in setup_falcon_data.
let falcon_ucode_offset = self.falcon_ucode_offset;
- // Make sure the offset is within the data bounds.
- if falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>() > self.base.data.len() {
- dev_err!(
- self.base.dev,
- "fwsec-frts header not contained within BIOS bounds\n"
- );
- return Err(ERANGE);
- }
-
// Read the first 4 bytes to get the version.
let hdr_bytes: [u8; 4] = self.base.data[falcon_ucode_offset..falcon_ucode_offset + 4]
.try_into()
@@ -1024,33 +1012,34 @@ impl FwSecBiosImage {
let hdr = u32::from_le_bytes(hdr_bytes);
let ver = (hdr & 0xff00) >> 8;
- if ver != 3 {
- dev_err!(self.base.dev, "invalid fwsec firmware version: {:?}\n", ver);
- return Err(EINVAL);
+ let data = self.base.data.get(falcon_ucode_offset..).ok_or(EINVAL)?;
+ match ver {
+ 2 => {
+ let v2 = FalconUCodeDescV2::from_bytes_copy_prefix(data)
+ .ok_or(EINVAL)?
+ .0;
+ Ok(FalconUCodeDesc::V2(v2))
+ }
+ 3 => {
+ let v3 = FalconUCodeDescV3::from_bytes_copy_prefix(data)
+ .ok_or(EINVAL)?
+ .0;
+ Ok(FalconUCodeDesc::V3(v3))
+ }
+ _ => {
+ dev_err!(self.base.dev, "invalid fwsec firmware version: {:?}\n", ver);
+ Err(EINVAL)
+ }
}
-
- // Return a reference to the FalconUCodeDescV3 structure.
- //
- // SAFETY: We have checked that `falcon_ucode_offset + size_of::<FalconUCodeDescV3>` is
- // within the bounds of `data`. Also, this data vector is from ROM, and the `data` field
- // in `BiosImageBase` is immutable after construction.
- Ok(unsafe {
- &*(self
- .base
- .data
- .as_ptr()
- .add(falcon_ucode_offset)
- .cast::<FalconUCodeDescV3>())
- })
}
/// Get the ucode data as a byte slice
- pub(crate) fn ucode(&self, desc: &FalconUCodeDescV3) -> Result<&[u8]> {
+ pub(crate) fn ucode(&self, desc: &FalconUCodeDesc) -> Result<&[u8]> {
let falcon_ucode_offset = self.falcon_ucode_offset;
// The ucode data follows the descriptor.
let ucode_data_offset = falcon_ucode_offset + desc.size();
- let size = usize::from_safe_cast(desc.imem_load_size + desc.dmem_load_size);
+ let size = usize::from_safe_cast(desc.imem_load_size() + desc.dmem_load_size());
// Get the data slice, checking bounds in a single operation.
self.base
@@ -1066,10 +1055,14 @@ impl FwSecBiosImage {
}
/// Get the signatures as a byte slice
- pub(crate) fn sigs(&self, desc: &FalconUCodeDescV3) -> Result<&[Bcrt30Rsa3kSignature]> {
+ pub(crate) fn sigs(&self, desc: &FalconUCodeDesc) -> Result<&[Bcrt30Rsa3kSignature]> {
+ let hdr_size = match desc {
+ FalconUCodeDesc::V2(_v2) => core::mem::size_of::<FalconUCodeDescV2>(),
+ FalconUCodeDesc::V3(_v3) => core::mem::size_of::<FalconUCodeDescV3>(),
+ };
// The signatures data follows the descriptor.
- let sigs_data_offset = self.falcon_ucode_offset + core::mem::size_of::<FalconUCodeDescV3>();
- let sigs_count = usize::from(desc.signature_count);
+ let sigs_data_offset = self.falcon_ucode_offset + hdr_size;
+ let sigs_count = usize::from(desc.signature_count());
let sigs_size = sigs_count * core::mem::size_of::<Bcrt30Rsa3kSignature>();
// Make sure the data is within bounds.
diff --git a/rust/helpers/drm.c b/rust/helpers/drm.c
index 450b406c6f27..fe226f7b53ef 100644
--- a/rust/helpers/drm.c
+++ b/rust/helpers/drm.c
@@ -5,17 +5,18 @@
#ifdef CONFIG_DRM
-void rust_helper_drm_gem_object_get(struct drm_gem_object *obj)
+__rust_helper void rust_helper_drm_gem_object_get(struct drm_gem_object *obj)
{
drm_gem_object_get(obj);
}
-void rust_helper_drm_gem_object_put(struct drm_gem_object *obj)
+__rust_helper void rust_helper_drm_gem_object_put(struct drm_gem_object *obj)
{
drm_gem_object_put(obj);
}
-__u64 rust_helper_drm_vma_node_offset_addr(struct drm_vma_offset_node *node)
+__rust_helper __u64
+rust_helper_drm_vma_node_offset_addr(struct drm_vma_offset_node *node)
{
return drm_vma_node_offset_addr(node);
}
diff --git a/rust/kernel/drm/driver.rs b/rust/kernel/drm/driver.rs
index f30ee4c6245c..e09f977b5b51 100644
--- a/rust/kernel/drm/driver.rs
+++ b/rust/kernel/drm/driver.rs
@@ -121,7 +121,6 @@ pub trait Driver {
pub struct Registration<T: Driver>(ARef<drm::Device<T>>);
impl<T: Driver> Registration<T> {
- /// Creates a new [`Registration`] and registers it.
fn new(drm: &drm::Device<T>, flags: usize) -> Result<Self> {
// SAFETY: `drm.as_raw()` is valid by the invariants of `drm::Device`.
to_result(unsafe { bindings::drm_dev_register(drm.as_raw(), flags) })?;
@@ -129,8 +128,9 @@ impl<T: Driver> Registration<T> {
Ok(Self(drm.into()))
}
- /// Same as [`Registration::new`}, but transfers ownership of the [`Registration`] to
- /// [`devres::register`].
+ /// Registers a new [`Device`](drm::Device) with userspace.
+ ///
+ /// Ownership of the [`Registration`] object is passed to [`devres::register`].
pub fn new_foreign_owned(
drm: &drm::Device<T>,
dev: &device::Device<device::Bound>,
diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs
index a7f682e95c01..d49a9ba02635 100644
--- a/rust/kernel/drm/gem/mod.rs
+++ b/rust/kernel/drm/gem/mod.rs
@@ -210,7 +210,7 @@ impl<T: DriverObject> Object<T> {
// SAFETY: The arguments are all valid per the type invariants.
to_result(unsafe { bindings::drm_gem_object_init(dev.as_raw(), obj.obj.get(), size) })?;
- // SAFETY: We never move out of `Self`.
+ // SAFETY: We will never move out of `Self` as `ARef<Self>` is always treated as pinned.
let ptr = KBox::into_raw(unsafe { Pin::into_inner_unchecked(obj) });
// SAFETY: `ptr` comes from `KBox::into_raw` and hence can't be NULL.
@@ -253,7 +253,7 @@ impl<T: DriverObject> Object<T> {
}
// SAFETY: Instances of `Object<T>` are always reference-counted.
-unsafe impl<T: DriverObject> crate::types::AlwaysRefCounted for Object<T> {
+unsafe impl<T: DriverObject> crate::sync::aref::AlwaysRefCounted for Object<T> {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
unsafe { bindings::drm_gem_object_get(self.as_raw()) };
@@ -293,9 +293,7 @@ impl<T: DriverObject> AllocImpl for Object<T> {
}
pub(super) const fn create_fops() -> bindings::file_operations {
- // SAFETY: As by the type invariant, it is safe to initialize `bindings::file_operations`
- // zeroed.
- let mut fops: bindings::file_operations = unsafe { core::mem::zeroed() };
+ let mut fops: bindings::file_operations = pin_init::zeroed();
fops.owner = core::ptr::null_mut();
fops.open = Some(bindings::drm_open);
diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
index 432fc0297d4a..adecb200c654 100644
--- a/rust/kernel/page.rs
+++ b/rust/kernel/page.rs
@@ -25,14 +25,36 @@ pub const PAGE_SIZE: usize = bindings::PAGE_SIZE;
/// A bitmask that gives the page containing a given address.
pub const PAGE_MASK: usize = !(PAGE_SIZE - 1);
-/// Round up the given number to the next multiple of [`PAGE_SIZE`].
+/// Rounds up to the next multiple of [`PAGE_SIZE`].
///
-/// It is incorrect to pass an address where the next multiple of [`PAGE_SIZE`] doesn't fit in a
-/// [`usize`].
-pub const fn page_align(addr: usize) -> usize {
- // Parentheses around `PAGE_SIZE - 1` to avoid triggering overflow sanitizers in the wrong
- // cases.
- (addr + (PAGE_SIZE - 1)) & PAGE_MASK
+/// Returns [`None`] on integer overflow.
+///
+/// # Examples
+///
+/// ```
+/// use kernel::page::{
+/// page_align,
+/// PAGE_SIZE,
+/// };
+///
+/// // Requested address is already aligned.
+/// assert_eq!(page_align(0x0), Some(0x0));
+/// assert_eq!(page_align(PAGE_SIZE), Some(PAGE_SIZE));
+///
+/// // Requested address needs alignment up.
+/// assert_eq!(page_align(0x1), Some(PAGE_SIZE));
+/// assert_eq!(page_align(PAGE_SIZE + 1), Some(2 * PAGE_SIZE));
+///
+/// // Requested address causes overflow (returns `None`).
+/// let overflow_addr = usize::MAX - (PAGE_SIZE / 2);
+/// assert_eq!(page_align(overflow_addr), None);
+/// ```
+#[inline(always)]
+pub const fn page_align(addr: usize) -> Option<usize> {
+ let Some(sum) = addr.checked_add(PAGE_SIZE - 1) else {
+ return None;
+ };
+ Some(sum & PAGE_MASK)
}
/// Representation of a non-owning reference to a [`Page`].