NCP-AII Exam Dumps - NVIDIA-Certified Professional Questions and Answers

Question # 4

During cluster validation, the Cable Validation Tool (CVT) reports "Underperforming (BER)" for an InfiniBand link. Which BER thresholds indicate a critical signal quality issue requiring cable replacement?

Options:

A. Rx power variance > 3 dB between lanes
B. Effective BER > 0 during the first 125 minutes of link operation
C. Raw BER > 1e-12 or Effective BER > 1.5E-254 for <6hr measurements
D. Temperature > 85°C on transceiver module
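
For readers unfamiliar with the metric, a bit error rate is the number of errored bits divided by the number of bits received. The sketch below is a minimal illustration, not CVT's implementation: the function names are hypothetical, the counters are made up, and the default limits are simply the figures quoted in option C.

```python
def bit_error_rate(bit_errors: int, bits_received: int) -> float:
    """Return the observed bit error rate: errored bits / total bits."""
    if bits_received <= 0:
        raise ValueError("bits_received must be positive")
    return bit_errors / bits_received


def exceeds_ber_limits(raw_ber: float, effective_ber: float,
                       raw_limit: float = 1e-12,
                       effective_limit: float = 1.5e-254) -> bool:
    """Return True if either BER is above its limit.

    The default limits are copied from option C for illustration only;
    they are not taken from the tool's source or documentation.
    """
    return raw_ber > raw_limit or effective_ber > effective_limit


# Hypothetical counters: 5 errored bits out of 10**12 received bits.
raw = bit_error_rate(5, 10**12)
print(raw, exceeds_ber_limits(raw, effective_ber=0.0))
```

Passing the limits as parameters keeps the comparison separate from any particular tool's thresholds.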

Question # 5

A system administrator needs to install a GPU/DPU in a server. The server has a free PCIe slot, enough free PCIe lanes, and enough room for the card. Which procedure should be followed?

Options:

A. Ensure the server has enough power. Verify compatibility of cables with server's platform. Make sure the server is down to remove cables safely. Do not wear an ESD bracelet.
B. Ensure the server has enough power. Make sure the server is down to remove cables safely. Wear an ESD bracelet.
C. Ensure the server has enough power. Make sure the server is up and running with attached cables. Wear an ESD bracelet.
D. Ensure the server has enough power. Verify compatibility of cables with server's platform. Make sure the server is down to remove cables safely. Wear an ESD bracelet.

Question # 6

A system administrator needs to validate a GPU-based server and ensure that no errors occur under load. What command should be used?

Options:

A. nvsm dump health
B. stress-test --usage
C. nvsm show health
D. nvsm stress-test

Question # 7

During multi-node HPL burn-in, GPUs show uneven utilization. Which configuration ensures balanced workload distribution?

Options:

A. Enable HPL_USE_NVSHMEM=1 for shared memory acceleration
B. HPL_RUN_GEMM_TESTS to skip validation
C. Set --gpu-affinity and --cpu-affinity to align GPU and NUMA nodes
D. HPL_OOC_TILE_M to 8192 for larger blocks

Question # 8

To validate bisection bandwidth across two racks in a Spectrum-X Ethernet fabric, which NCCL test configuration isolates East-West traffic?

Options:

A. NCCL_TESTS_SPLIT="OR 0x7" ./all_reduce_perf -g 8
B. Run without splits and analyze per-rack averages.
C. NCCL_TESTS_SPLIT="MOD 2" ./all_reduce_perf -g 8
D. NCCL_TESTS_SPLIT="DIV 8" ./all_reduce_perf -g 1
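
The nccl-tests README describes NCCL_TESTS_SPLIT as an operation (AND, OR, MOD or DIV) applied to each rank to produce a "color", with ranks of the same color forming one group. The sketch below models that grouping in plain Python so the three split values in the options can be compared side by side. It is not the nccl-tests code: `split_color` and `groups` are hypothetical helpers, and the 16-rank layout (two 8-GPU nodes, ranks numbered node by node) is an assumption.

```python
import operator

# Operation names and symbols, as listed in the nccl-tests README.
_OPS = {
    "AND": operator.and_, "&": operator.and_,
    "OR": operator.or_, "|": operator.or_,
    "MOD": operator.mod, "%": operator.mod,
    "DIV": operator.floordiv, "/": operator.floordiv,
}


def split_color(rank: int, spec: str) -> int:
    """Compute the color of a rank for a spec such as 'MOD 2' or 'OR 0x7'."""
    spec = spec.strip()
    for name, fn in _OPS.items():
        if spec.upper().startswith(name):
            # int(..., 0) accepts decimal, 0x-prefixed and 0b-prefixed values.
            value = int(spec[len(name):].strip(), 0)
            return fn(rank, value)
    raise ValueError(f"unrecognised split spec: {spec!r}")


def groups(num_ranks: int, spec: str) -> dict:
    """Map each color to the list of ranks that share it."""
    out = {}
    for rank in range(num_ranks):
        out.setdefault(split_color(rank, spec), []).append(rank)
    return out


# Assumed layout: 2 nodes x 8 GPUs, ranks 0-7 on node 0 and 8-15 on node 1.
for spec in ("OR 0x7", "MOD 2", "DIV 8"):
    print(spec, groups(16, spec))
```

Under the assumed layout, a group whose ranks all sit in 0-7 or all in 8-15 stays inside one node, while a group that mixes both ranges has to cross the network.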

Question # 9

A user encounters "permission denied" errors when running GPU-accelerated containers on a Secure Boot-enabled system. What resolves this?

Options:

A. Enroll the MOK and sign NVIDIA kernel modules.
B. Reinstall Docker without the NVIDIA runtime.
C. Disable SELinux to relax unnecessary security policies.
D. Run Docker with sudo for elevated privileges.

Question # 10

A user wants to restrict a Docker container to use only GPUs 0 and 2. Which command achieves this?

Options:

A. docker run --gpus '"device=0,2"' nvidia/cuda:12.1-base nvidia-smi
B. docker run -e NVIDIA_VISIBLE_DEVICES=0,2 nvidia/cuda:12.1-base nvidia-smi
C. docker run --gpus all nvidia/cuda:12.1-base nvidia-smi -id=0,2
D. docker run --device /dev/nvidia0,/dev/nvidia2 nvidia/cuda:12.1-base nvidia-smi
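
The nested quoting in `--gpus '"device=0,2"'` is easy to get wrong: the shell strips the outer single quotes, and Docker receives the inner double quotes as part of the value. The sketch below builds that argument list for use without a shell. `docker_gpu_command` is a hypothetical helper, and the image tag is the one quoted in the options.

```python
import shlex


def docker_gpu_command(gpu_ids, image="nvidia/cuda:12.1-base",
                       command=("nvidia-smi",)):
    """Build a `docker run` argv that exposes only the listed GPU indices."""
    # The double quotes are part of the value passed to --gpus, so the
    # comma-separated device list is treated as a single field.
    device_spec = '"device=' + ",".join(str(i) for i in gpu_ids) + '"'
    return ["docker", "run", "--gpus", device_spec, image, *command]


argv = docker_gpu_command([0, 2])
# shlex.join re-adds the outer single quotes needed on a shell command line.
print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv)` on a host with Docker and the NVIDIA Container Toolkit installed.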

Question # 11

A media company is developing an AI platform for video content analysis that requires storing and processing large volumes of unstructured video data. The platform must support high throughput for data ingestion and provide efficient access for real-time analytics. Given these requirements, which storage strategy should the company implement?

Options:

A. Tape storage for its cost-effectiveness and archival capabilities
B. Block storage for low latency and high performance
C. File storage for hierarchical organization and easy navigation
D. Object storage for scalability and metadata management

Question # 12

A system administrator receives an alert about a potential hardware fault on an NVIDIA DGX A100. The GPU performance seems degraded, and the system fans are operating loudly. What step should be recommended to identify and troubleshoot the hardware fault?

Options:

A. Run a deep learning workload to stress test the GPUs and check whether the issue persists.
B. Check the NVIDIA System Management Interface (nvidia-smi) for GPU status and temperatures.
C. Power drain then restart the DGX and check if the performance degradation resolves.
D. Increase the fan speed to maximum and check whether the performance improves.
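
Checking GPU status and temperature with nvidia-smi can be scripted. The sketch below parses the CSV produced by `nvidia-smi --query-gpu=index,temperature.gpu,utilization.gpu --format=csv,noheader,nounits` and lists GPUs at or above a temperature limit. The helper names, the sample text and the 85 °C limit are illustrative assumptions, not NVIDIA guidance, and fields reported as `[N/A]` are not handled.

```python
import csv
import io

# Command whose output the parser expects (run on the DGX itself).
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,temperature.gpu,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_status(csv_text: str) -> list:
    """Turn the CSV text into a list of per-GPU dictionaries."""
    rows = []
    for fields in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not fields:
            continue  # skip blank lines
        index, temp_c, util_pct = (f.strip() for f in fields)
        rows.append({"index": int(index), "temp_c": int(temp_c),
                     "util_pct": int(util_pct)})
    return rows


def hot_gpus(rows: list, limit_c: int) -> list:
    """Return the indices of GPUs at or above limit_c degrees Celsius."""
    return [r["index"] for r in rows if r["temp_c"] >= limit_c]


# Made-up sample standing in for real command output.
sample = "0, 45, 10\n1, 91, 3\n"
print(hot_gpus(parse_gpu_status(sample), limit_c=85))
```

On a real system the sample string would be replaced by `subprocess.run(QUERY, capture_output=True, text=True).stdout`.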

Question # 13

Why is it important to provide a large and high-performance local cache (using SSDs configured as RAID-0) for deep learning workloads on DGX systems?

Options:

A. Local SSD cache allows users to increase the number of NFS threads on the server without impacting storage reliability.
B. Using local SSD cache in RAID-0 enables direct GPU access to files without host CPU involvement, further boosting performance.
C. Local SSD cache in RAID-0 is necessary to provide redundancy in case one of the drives fails during long training runs.
D. A local SSD cache in RAID-0 ensures that most training data is read only once from the network, significantly reducing NFS traffic.
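
The effect of a local cache on network reads can be estimated with simple arithmetic. The sketch below uses a deliberately simplified model in which every epoch reads the whole dataset once and only the uncached part is fetched again after the first epoch. The function name, the dataset size and the epoch count are all hypothetical.

```python
def network_bytes_read(dataset_bytes: int, epochs: int,
                       cached_fraction: float = 0.0) -> float:
    """Estimate bytes fetched over the network across all epochs.

    Simplified model: the first epoch reads the full dataset from the
    network; each later epoch re-reads only the part not held in cache.
    """
    if epochs < 1:
        return 0.0
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    uncached = dataset_bytes * (1.0 - cached_fraction)
    return dataset_bytes + (epochs - 1) * uncached


# Hypothetical workload: 2 TB dataset, 50 epochs.
dataset = 2 * 10**12
print(network_bytes_read(dataset, 50, cached_fraction=0.0))  # no local cache
print(network_bytes_read(dataset, 50, cached_fraction=1.0))  # fully cached
```

In this model a cache that holds the whole dataset cuts network reads from 50 passes to a single pass.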

Exam Code: NCP-AII
Exam Name: NVIDIA AI Infrastructure
Last Update: Feb 28, 2026
Questions: 71