Pre-Summer Sale 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: Board70

NCP-AII Exam Dumps - NVIDIA-Certified Professional Questions and Answers

Question # 14

After ClusterKit reports " GPU-Host latency exceeds threshold, " which NVIDIA diagnostic tool should be used to isolate hardware faults?

Options:

A.

Re-run ClusterKit with --stress=gpu -Y 60 to extend test duration

B.

nvidia-smi topo -m to inspect GPU topology connections

C.

DCGM Diags dcgmi diag -r 2

D.

ib_write_bw to measure InfiniBand bandwidth between nodes

Buy Now
Question # 15

A system administrator receives an alert about a potential hardware fault on an NVIDIA DGX A100. The GPU performance seems degraded, and the system fans are operating loudly. What step should be recommended to identify and troubleshoot the hardware fault?

Options:

A.

Run a deep learning workload to stress test the GPUs and check whether the issue persists.

B.

Check the NVIDIA System Management Interface (nvidia-smi) for GPU status and temperatures.

C.

Power drain then restart the DGX and check if the performance degradation resolves.

D.

Increase the fan speed to maximum and check whether the performance improves.

Buy Now
Question # 16

During East-West fabric validation on a 64-GPU cluster, an engineer runs all_reduce_perf and observes an algorithm bandwidth of 350 GB/s and bus bandwidth of 656 GB/s. What does this indicate about the fabric performance?

Options:

A.

Inconclusive; rerun with point-to-point tests.

B.

Optimal performance; bus bandwidth near theoretical peak for NDR InfiniBand.

C.

Critical failure; bus bandwidth exceeds hardware capabilities.

D.

Suboptimal performance; algorithm bandwidth should match bus bandwidth.

Buy Now
Question # 17

After configuring NGC CLI with ngc config set, a user receives ”Authentication failed” errors when pulling containers. What step was most likely omitted?

Options:

A.

Installing the CLI with apt-get instead of manual extraction.

B.

Entering the API key during ngc config set or storing it in ~/.ngc/config.

C.

Setting --format_type=json to enable API interactions.

D.

Running sudo systemctl restart docker after configuration.

Buy Now
Question # 18

A system engineer needs to set the vGPU scheduling behavior for all GPUs to share the scheduling equally with the default time slice length. What command should be used?

Options:

A.

esxcli system module parameters set -m nvidia -p " NVreg_RegistryDwords=RmPVMRL=0x01 "

B.

esxcli graphics module parameters set -m nvidia -p " NVreg_RegistryDwords=RmPVMRL=0x01 "

C.

esxcli system module parameters set -m nvidia -p " NVreg_RegistryDwords=FRL=0x01 "

D.

esxcli system module parameters set -m nvidia -p " NVreg_RegistryDwords=RmPVMRL=0x00 "

Buy Now
Question # 19

A system administrator needs to configure a BlueField DPU and enable RShim on the baseboard management controller (BMC). Which command should be executed?

Options:

A.

ipmitool raw 0x32 0x6a 1

B.

systemctl restart rshim

C.

systemctl enable bmc-rshim.service

D.

scp < path_to_bfb > root@ < bmc_ip > :/dev/rshim0/boot

Buy Now
Question # 20

Which statement best explains why maintaining high cable signal quality is essential in modern high-speed data centers?

Options:

A.

High cable signal quality ensures that cable length and connector type do not play as big a role in deploying new infrastructure in the data center.

B.

High cable signal quality minimizes bit error rates and supports reliable, high-throughput communication, reducing retransmissions and congestion across the network.

C.

High cable signal quality reduces electromagnetic interference (EMI) and crosstalk, helping prevent unexpected packet drops during sustained workloads.

D.

High cable signal quality enables effective use of Forward Error Correction (FEC), which is required for reliable operation at high data rates such as 200GbE and above.

Buy Now
Question # 21

You are expanding a DGX-based deep learning cluster to train on large, high-resolution images that cannot fit into local cache. Multiple nodes will access this data concurrently and require high performance. Which storage and networking solution best meets these requirements?

Options:

A.

Increase the SSD RAID-0 local cache size in each node so it can absorb most training data, making network storage type and speed less important for performance.

B.

Implement a standard NFS server on a 10GbE network because the cluster can access the export and job performance will not be impacted.

C.

Deploy a high-performance parallel file system across InfiniBand or 40/100GbE, ensuring at least 3 GB/s per node and scalable aggregate bandwidth for all cluster workloads.

D.

Recommend general-purpose object storage for all training data because it is optimized for deep learning workloads and distributed data access at any scale.

Buy Now
Question # 22

An engineer is tasked with configuring Out-of-Band management for a DGX BasePOD deployment. Which network design will best ensure secure and reliable Out-of-Band management operations?

Options:

A.

Use a single VLAN for both Out-of-Band management and compute fabric to simplify network design.

B.

Configure Out-of-Band management interfaces to be accessible from any subnet within the data center for maximum flexibility.

C.

Connect Out-of-Band management ports to the same switch as user traffic for easier troubleshooting.

D.

Place all BMC and management interfaces on an isolated Out-of-Band network with access restricted by firewall rules.

Buy Now
Question # 23

After NCCL burn-in reports " transport retry count exceeded, " which corrective action addresses the underlying fabric issue?

Options:

A.

Switch from Ring to Tree algorithms via NCCL_ALGO=TREE

B.

Reduce message size to decrease network utilization

C.

Increase NCCL_IB_TIMEOUT to tolerate longer latencies

D.

Inspect InfiniBand link quality metrics (BER, symbol errors) and replace faulty cables

Buy Now
Exam Code: NCP-AII
Exam Name: NVIDIA AI Infrastructure
Last Update: May 30, 2026
Questions: 123
NCP-AII pdf

NCP-AII PDF

$25.5  $84.99
NCP-AII Engine

NCP-AII Testing Engine

$28.5  $94.99
NCP-AII PDF + Engine

NCP-AII PDF + Testing Engine

$40.5  $134.99