Flashcards

DeepLearning_GPT3_questions

A test in flashcard form
Number of questions: 85
What is the purpose of the peephole connections in a peephole LSTM?
To allow the gates to observe the input sequence directly
To allow the gates to observe the current cell state directly
To allow the gates to adjust their activations based on the previous time step
To allow the gates to observe the previous hidden state directly
To allow the gates to observe the current cell state directly

To allow the gates to observe the current cell state directly. The peephole connections in a peephole LSTM allow the gates to directly observe the current cell state, giving them additional information when deciding what to forget, store, and output.
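
A minimal NumPy sketch of how a peephole connection changes a single gate's computation; the dimensions, weight names, and the choice of the forget gate are illustrative, not taken from any particular implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
W_f = rng.normal(size=(n_hid, n_in))    # input-to-gate weights
U_f = rng.normal(size=(n_hid, n_hid))   # hidden-to-gate weights
p_f = rng.normal(size=n_hid)            # peephole weights (elementwise)
b_f = np.zeros(n_hid)

x_t = rng.normal(size=n_in)             # current input
h_prev = np.zeros(n_hid)                # previous hidden state
c_prev = rng.normal(size=n_hid)         # previous cell state

# Standard gate: sees only the input and the previous hidden state.
f_standard = sigmoid(W_f @ x_t + U_f @ h_prev + b_f)

# Peephole gate: additionally observes the cell state directly.
f_peephole = sigmoid(W_f @ x_t + U_f @ h_prev + p_f * c_prev + b_f)
```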

Which of the following is true about the output of an LSTM cell?
The output is determined by the forget gate
The output is always the same as the hidden state
The output is a function of both the hidden state and the cell state
The output is determined by the output gate
The output is a function of both the hidden state and the cell state

The output is a function of both the hidden state and the cell state. The hidden state h_t is computed as h_t = o_t * tanh(c_t): the output gate o_t, which itself depends on the previous hidden state and the current input, controls which elements of the squashed cell state are exposed. The resulting hidden state serves both as the cell's output and as the state passed on to the next time step.
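
A compact NumPy sketch of one LSTM step showing this output computation; the stacked-weight layout and dimensions are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step; W, U, b stack the parameters of all four gates."""
    z = W @ x_t + U @ h_prev + b        # pre-activations for all gates
    n = len(c_prev)
    f = sigmoid(z[0*n:1*n])             # forget gate
    i = sigmoid(z[1*n:2*n])             # input gate
    o = sigmoid(z[2*n:3*n])             # output gate
    g = np.tanh(z[3*n:4*n])             # candidate cell update
    c_t = f * c_prev + i * g            # new cell state
    h_t = o * np.tanh(c_t)              # output: gated, squashed cell state
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hid = 4, 3
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid), W, U, b)
```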

What is the difference between a standard LSTM cell and a peephole LSTM cell?
Peephole LSTM cells do not have an output gate
Peephole LSTM cells have an extra forget gate
Peephole LSTM cells have additional connections from the cell state to the gates
Peephole LSTM cells use a different activation function
Peephole LSTM cells have additional connections from the cell state to the gates

Peephole LSTM cells have additional connections from the cell state to the gates. In addition to the input, forget, and output gates used in a standard LSTM cell, peephole LSTM cells also have connections from the cell state to each of the gates. This allows the gates to directly observe the cell state and make more informed decisions about how to update it.

Which of the following is an advantage of using LSTM over traditional recurrent neural networks (RNNs)?
LSTMs can handle variable-length sequences
LSTMs converge faster than RNNs
LSTMs are less prone to overfitting than RNNs
LSTMs require fewer parameters than RNNs
LSTMs can handle variable-length sequences

LSTMs can handle variable-length sequences. LSTMs are designed to address the vanishing gradient problem that occurs in traditional RNNs, which makes it difficult for RNNs to capture long-term dependencies in sequences. LSTMs achieve this by using a memory cell and gates that control the flow of information through the cell, which makes them well suited to long, variable-length sequences, a common case in many applications.

What is the purpose of the forget gate in an LSTM cell?
To determine the output of the LSTM cell
To control how much of the cell state is updated
To decide whether to update the cell state or not
To determine the input to the output gate
To decide whether to update the cell state or not

To decide whether to update the cell state or not. The forget gate takes as input the concatenation of the previous hidden state and the current input, and outputs a value between 0 and 1 for each element in the cell state vector. This value determines how much of the corresponding element in the previous cell state should be kept or forgotten.

Which of the following is NOT a type of gate in an LSTM?
Output gate
Forget gate
Update gate
Input gate
Update gate

Update gate is not a type of gate in an LSTM; it belongs to the GRU architecture. The three LSTM gates are the forget gate, the input gate, and the output gate. The update of the cell state is controlled by the forget and input gates, which determine what information should be retained and added to the cell state, respectively.

What is the purpose of the teacher forcing technique in training RNNs?
To provide the network with the correct input at each time step during training.
To speed up the convergence of the network.
To prevent overfitting.
To improve the generalization ability of the network.
To provide the network with the correct input at each time step during training.

To provide the network with the correct input at each time step during training. In the teacher forcing technique, the correct output from the previous time step is used as input to the network at the current time step during training, instead of using the output predicted by the network at the previous time step. This is done to ensure that the network receives the correct input during training and can learn to make accurate predictions.
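
A schematic sketch of the two decoding modes, assuming a hypothetical model_step(prev_output, hidden) function that returns a prediction and an updated hidden state (both the function and START_TOKEN are illustrative names):

```python
START_TOKEN = 0  # illustrative start-of-sequence symbol

def run_with_teacher_forcing(model_step, targets, h):
    """Training-time decoding: the ground-truth previous output is fed at each step."""
    predictions, prev = [], START_TOKEN
    for t in range(len(targets)):
        y_hat, h = model_step(prev, h)
        predictions.append(y_hat)
        prev = targets[t]      # teacher forcing: feed the true output
    return predictions

def run_free_running(model_step, num_steps, h):
    """Inference-time decoding: the model's own prediction is fed back."""
    predictions, prev = [], START_TOKEN
    for _ in range(num_steps):
        y_hat, h = model_step(prev, h)
        predictions.append(y_hat)
        prev = y_hat           # no teacher: feed back the prediction
    return predictions
```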

What is the difference between a unidirectional and bidirectional RNN?
Both unidirectional and bidirectional RNNs can only process data in one direction.
Both unidirectional and bidirectional RNNs can process data in both directions
A unidirectional RNN can only process data in one direction, while a bidirectional RNN can process data in both directions.
A unidirectional RNN can process data in both directions, while a bidirectional RNN can only process data in one direction.
A unidirectional RNN can only process data in one direction, while a bidirectional RNN can process data in both directions.

A unidirectional RNN can only process data in one direction, while a bidirectional RNN can process data in both directions. In a unidirectional RNN, information flows only from the past to the future, whereas in a bidirectional RNN, information flows in both directions, allowing the network to take into account past and future context when making predictions.
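
A minimal NumPy sketch: a bidirectional layer runs one RNN left-to-right and a second RNN right-to-left over the same sequence, then concatenates the two states at each position (all weights and sizes are illustrative):

```python
import numpy as np

def rnn_pass(xs, W, U, b):
    """Simple tanh RNN over a sequence; returns the hidden state at each step."""
    h, states = np.zeros(U.shape[0]), []
    for x in xs:
        h = np.tanh(W @ x + U @ h + b)
        states.append(h)
    return states

rng = np.random.default_rng(0)
n_in, n_hid, T = 4, 3, 5
xs = [rng.normal(size=n_in) for _ in range(T)]
Wf, Uf, bf = rng.normal(size=(n_hid, n_in)), rng.normal(size=(n_hid, n_hid)), np.zeros(n_hid)
Wb, Ub, bb = rng.normal(size=(n_hid, n_in)), rng.normal(size=(n_hid, n_hid)), np.zeros(n_hid)

forward = rnn_pass(xs, Wf, Uf, bf)               # past -> future
backward = rnn_pass(xs[::-1], Wb, Ub, bb)[::-1]  # future -> past, re-aligned
bidir = [np.concatenate([f, b]) for f, b in zip(forward, backward)]
```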

What is the long short-term memory (LSTM) architecture designed to address in RNNs?
The underfitting problem.
The exploding gradient problem.
The overfitting problem.
The vanishing gradient problem.
The vanishing gradient problem.

The vanishing gradient problem. The LSTM architecture is designed to address the issue of the gradients becoming too small during backpropagation in RNNs. It uses a memory cell and three gates (input, output, and forget) to selectively remember or forget information over time.

What is the vanishing gradient problem in recurrent neural networks (RNNs)?
The weights of the network become too large.
The gradients become too small during backpropagation.
The gradients become too large during backpropagation.
The weights of the network become too small.
The gradients become too small during backpropagation.

The gradients become too small during backpropagation. The vanishing gradient problem refers to the issue of the gradients becoming extremely small as they are backpropagated through many time steps in an RNN. This can make it difficult for the network to learn long-term dependencies.
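
A tiny linearized demonstration: backpropagating through many time steps multiplies the gradient by the recurrent Jacobian repeatedly, so when the weights are small the gradient norm shrinks exponentially (the 0.5 scale below is an illustrative choice, and the tanh derivative is ignored for simplicity):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
W = 0.5 * rng.normal(size=(n, n)) / np.sqrt(n)  # deliberately small recurrent weights
grad = np.ones(n)                               # gradient at the final time step

for t in range(50):                             # backpropagate through 50 steps
    grad = W.T @ grad                           # repeated Jacobian product
    if (t + 1) % 10 == 0:
        print(f"after {t + 1:2d} steps: ||grad|| = {np.linalg.norm(grad):.2e}")
```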

Which of the following is a potential application of UNet in medical image analysis?
Detecting anomalies in a time series
All of the above
Segmenting tumor regions in an MRI
Identifying objects in an image
Segmenting tumor regions in an MRI

Segmenting tumor regions in an MRI. UNet has been used in various medical image analysis tasks, including segmenting tumor regions in MRIs and identifying different types of cells in histology images.

How does UNet handle class imbalance in image segmentation tasks?
By oversampling the minority classes
By weighting the loss function for underrepresented classes
By undersampling the majority classes
UNet does not handle class imbalance
By weighting the loss function for underrepresented classes

By weighting the loss function for underrepresented classes. UNet typically uses a modified cross-entropy loss function that weights the contribution of each pixel to the loss based on the frequency of its class in the training data.
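
A sketch of a per-pixel weighted cross-entropy in NumPy; inverse-frequency weighting is shown as one common choice, not necessarily the exact scheme of any given UNet implementation:

```python
import numpy as np

def weighted_cross_entropy(probs, labels, class_weights):
    """probs: (N, C) predicted class probabilities; labels: (N,) integer class ids."""
    eps = 1e-12
    pixel_losses = -np.log(probs[np.arange(len(labels)), labels] + eps)
    return np.mean(class_weights[labels] * pixel_losses)

# Toy example: class 1 (say, tumor pixels) is rare, so it gets a larger weight.
labels = np.array([0, 0, 0, 0, 0, 0, 0, 1])   # flattened pixel labels
freq = np.bincount(labels) / len(labels)
class_weights = 1.0 / freq                    # inverse-frequency weights
class_weights /= class_weights.sum()

probs = np.full((len(labels), 2), 0.5)        # uninformative predictions
print(weighted_cross_entropy(probs, labels, class_weights))
```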

Which operation is used in UNet to recover the image resolution?
Upsampling
Pooling
Dropout
Convolution
Upsampling

Upsampling. The expansive path in UNet uses upsampling and transposed convolution operations to recover the image resolution.
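
A minimal NumPy sketch of parameter-free nearest-neighbour upsampling, which doubles each spatial dimension; a transposed convolution recovers resolution the same way but with learned weights:

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Repeat each pixel `factor` times along both spatial axes."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

x = np.array([[1, 2],
              [3, 4]])
print(upsample_nearest(x))
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]
```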

Which part of UNet captures the context of an image?
The expansive path
The contracting path
All of the above
The bottleneck layer
The contracting path

The contracting path. The contracting (encoder) path of UNet applies repeated convolutions and downsampling to capture the context of the image, while the expansive path enables precise localization.

What type of deep learning task is UNet commonly used for?
Image classification
Object detection
Image segmentation
Natural language processing
Image segmentation

Image segmentation. UNet is often used for segmenting images into different regions, such as identifying different types of cells in a medical image.

Which of the following is an application of autoencoders?
Dimensionality reduction
Image denoising
Anomaly detection
All of the above
All of the above

All of the above. Explanation: Autoencoders have a wide range of applications, including image denoising, dimensionality reduction, and anomaly detection, among others.

What is the difference between a denoising autoencoder and a standard autoencoder?
Denoising autoencoders use noisy data as input during training.
Denoising autoencoders use a different cost function.
Denoising autoencoders use a different activation function.
Denoising autoencoders have an additional noise reduction layer.
Denoising autoencoders use noisy data as input during training.

Denoising autoencoders use noisy data as input during training.

Explanation: Denoising autoencoders are trained using noisy data as input, with the objective of reconstructing the original, noise-free input. This helps the autoencoder to learn more robust representations.
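
The key training detail in sketch form: the encoder sees a corrupted input, but the reconstruction loss is computed against the clean original (the noise level and the MSE loss are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
x_clean = rng.uniform(size=(32, 784))                      # a batch of clean inputs
x_noisy = x_clean + 0.3 * rng.normal(size=x_clean.shape)   # corrupted copies

def mse(x_hat, target):
    return np.mean((x_hat - target) ** 2)

# Denoising autoencoder training step (schematically):
#   x_hat = decoder(encoder(x_noisy))   # the network only ever sees x_noisy...
#   loss  = mse(x_hat, x_clean)         # ...but must reconstruct x_clean
# A standard autoencoder would instead use mse(decoder(encoder(x_clean)), x_clean).
```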

What is the bottleneck layer in an autoencoder?
The last hidden layer in the encoder network.
The first hidden layer in the encoder network.
The first hidden layer in the decoder network.
The last hidden layer in the decoder network.
The last hidden layer in the encoder network.

The last hidden layer in the encoder network.

Explanation: The bottleneck layer is the last hidden layer in the encoder network, which contains a compressed representation of the input.
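
In sketch form, with illustrative layer widths; the bottleneck is the narrowest layer, at the end of the encoder:

```python
# Illustrative widths of a fully connected autoencoder:
#   encoder: 784 -> 256 -> 64    (64 is the bottleneck, the compressed code)
#   decoder:        64 -> 256 -> 784
encoder_widths = [784, 256, 64]
decoder_widths = [64, 256, 784]
bottleneck_dim = encoder_widths[-1]   # the last hidden layer of the encoder
```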

Which of the following is a type of regularized autoencoder?
All of the above
Sparse autoencoder
Contractive autoencoder
Denoising autoencoder
All of the above

All of the above

Explanation: Denoising autoencoders, contractive autoencoders, and sparse autoencoders are all examples of regularized autoencoders.

What is the objective of a variational autoencoder (VAE)?
To maximize the lower bound on the log-likelihood of the data.
To minimize the reconstruction error between the input and the output.
To maximize the likelihood of the data under the encoder distribution.
To minimize the distance between the true data distribution and the learned distribution.
To maximize the lower bound on the log-likelihood of the data.

To maximize the lower bound on the log-likelihood of the data. Explanation: The objective of a VAE is to maximize the lower bound on the log-likelihood of the data, also known as the evidence lower bound (ELBO).
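
A NumPy sketch of the ELBO for a Gaussian encoder with a standard normal prior, where the KL term has its well-known closed form; the Gaussian reconstruction term is written up to an additive constant:

```python
import numpy as np

def elbo(x, x_hat, mu, log_var):
    """ELBO = E[log p(x|z)] - KL(q(z|x) || N(0, I)), computed per example."""
    # Reconstruction term: Gaussian log-likelihood up to an additive constant.
    recon = -0.5 * np.sum((x - x_hat) ** 2, axis=-1)
    # Closed-form KL between N(mu, diag(exp(log_var))) and N(0, I).
    kl = -0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var), axis=-1)
    return recon - kl   # training maximizes this lower bound

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8))          # toy batch
x_hat = x + 0.1                      # pretend reconstruction
mu = np.zeros((2, 4))                # encoder means
log_var = np.zeros((2, 4))           # encoder log-variances
print(elbo(x, x_hat, mu, log_var))   # higher is better
```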

Which of the following pooling methods is designed to capture spatial context in the feature maps?
Average pooling
Global pooling
Spatial pyramid pooling
Max pooling
Spatial pyramid pooling

Spatial pyramid pooling. Spatial pyramid pooling divides the feature maps into a pyramid of pooling regions, and pools each region using max or average pooling. This allows the model to capture information about the spatial context of the features at multiple scales.
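
A NumPy sketch of spatial pyramid pooling with max pooling over 1x1, 2x2, and 4x4 grids; the concatenated output has a fixed length regardless of the input's spatial size (the grid levels are an illustrative choice):

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a 2D feature map over an n x n grid for each level, then concatenate."""
    H, W = fmap.shape
    out = []
    for n in levels:
        for rows in np.array_split(np.arange(H), n):
            for cols in np.array_split(np.arange(W), n):
                out.append(fmap[np.ix_(rows, cols)].max())
    return np.array(out)   # length 1 + 4 + 16 = 21 for these levels

fmap = np.arange(64, dtype=float).reshape(8, 8)
print(spatial_pyramid_pool(fmap).shape)   # (21,)
```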

What is the main disadvantage of using pooling layers in convolutional neural networks?
They can increase the size of the output volume
They can lead to overfitting
They can make the model more computationally expensive
They can reduce the representational capacity of the model
They can reduce the representational capacity of the model

They can reduce the representational capacity of the model. Pooling layers can discard information from the feature maps and reduce the spatial resolution of the output volume, which can limit the ability of the model to capture fine-grained details.

Which of the following statements about max pooling is true?
It can be used to learn translation invariance
It always produces a smaller output volume than the input volume
It performs the same operation on all feature maps in a given layer
It can be used as an alternative to fully connected layers
It performs the same operation on all feature maps in a given layer

It performs the same operation on all feature maps in a given layer. Max pooling takes the maximum value in each pooling region of a feature map, and this same operation is applied independently to every feature map in the layer.
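
A minimal NumPy sketch of 2x2 max pooling with stride 2, applied independently to each feature map (input sizes are chosen to divide evenly):

```python
import numpy as np

def max_pool_2x2(x):
    """x: (C, H, W) with H and W even; returns (C, H//2, W//2)."""
    C, H, W = x.shape
    windows = x.reshape(C, H // 2, 2, W // 2, 2)
    return windows.max(axis=(2, 4))   # max over each 2x2 window, per channel

x = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
print(max_pool_2x2(x).shape)          # (2, 2, 2): spatial dims halved
```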

Which of the following pooling methods does not involve any parameter learning?
Global pooling
L2 pooling
Average pooling
Max pooling
Global pooling

Global pooling. Global pooling takes the entire feature map as input and outputs a single value, without any parameter learning.

What is the purpose of pooling layers in convolutional neural networks?
To increase the size of the feature maps
To increase the number of parameters in the model
To add non-linearities to the model
To reduce the spatial dimensions of the output volume
To reduce the spatial dimensions of the output volume

To reduce the spatial dimensions of the output volume. Pooling layers are used to downsample the spatial dimensions of the feature maps, which reduces the number of parameters in the model and helps to prevent overfitting.
