Neural Network | ML | DA Practice

250 views

You have a single hidden-layer neural network for a binary classification task. The input is $X \in \mathbb{R}^{n \times m}$, the output is $\hat{y} \in \mathbb{R}^{1 \times m}$, and the true label is $y \in \mathbb{R}^{1 \times m}$. The forward propagation equations are:

\[ \begin{aligned} z^{[1]} & = W^{[1]}X + b^{[1]} \\ a^{[1]} & = \sigma(z^{[1]}) \\ \hat{y} & = a^{[1]} \\ J & = -\frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} \log(\hat{y}^{(i)}) + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)}) \right) \end{aligned} \]

Write the expression for $\frac{\partial J}{\partial W^{[1]}}$ as a matrix product of two terms.

A) $\frac{\partial J}{\partial W^{[1]}} = X \cdot (\hat{y} - y)^T$

B) $\frac{\partial J}{\partial W^{[1]}} = (\hat{y} - y) \cdot X^T$

C) $\frac{\partial J}{\partial W^{[1]}} = X^T \cdot (\hat{y} - y)$

D) $\frac{\partial J}{\partial W^{[1]}} = (\hat{y} - y) \cdot \sigma'(z^{[1]}) \cdot X^T$
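
A sketch of the derivation, assuming $W^{[1]} \in \mathbb{R}^{1 \times n}$ and a scalar bias $b^{[1]}$ (the question does not state these shapes, but they follow from $\hat{y} \in \mathbb{R}^{1 \times m}$). Here $\hat{y}^{(i)}$ denotes the $i$-th entry of $\hat{y}$ and $x^{(i)}$ the $i$-th column of $X$. Because $\sigma'(z) = \sigma(z)(1 - \sigma(z))$, the sigmoid derivative cancels against the derivative of the cross-entropy term:

\[ \begin{aligned} \frac{\partial J}{\partial \hat{y}^{(i)}} & = -\frac{1}{m} \left( \frac{y^{(i)}}{\hat{y}^{(i)}} - \frac{1 - y^{(i)}}{1 - \hat{y}^{(i)}} \right) = \frac{1}{m} \cdot \frac{\hat{y}^{(i)} - y^{(i)}}{\hat{y}^{(i)} (1 - \hat{y}^{(i)})} \\ \frac{\partial J}{\partial z^{[1](i)}} & = \frac{\partial J}{\partial \hat{y}^{(i)}} \cdot \hat{y}^{(i)} (1 - \hat{y}^{(i)}) = \frac{1}{m} \left( \hat{y}^{(i)} - y^{(i)} \right) \\ \frac{\partial J}{\partial W^{[1]}} & = \sum_{i=1}^{m} \frac{\partial J}{\partial z^{[1](i)}} \, (x^{(i)})^T = \frac{1}{m} (\hat{y} - y) \cdot X^T \end{aligned} \]

Under these assumptions the result has shape $1 \times n$, matching $W^{[1]}$. Up to the $\frac{1}{m}$ factor, which the options leave out, it has the form of option B. Option D keeps a $\sigma'(z^{[1]})$ factor that has already cancelled, and options A and C produce an $n \times 1$ result, which would not match a $1 \times n$ weight matrix.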

rajveer43 asked Jan 27

Correct. There is nothing hard in this question for anyone who has gone through the mathematical derivation of the network, but I don’t think the DA paper would have this type of question. (comment by rajveer43, Jan 28)
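
For readers who want to check a derivation like the one this comment refers to, a finite-difference gradient check is one option. The sketch below is not part of the original question: it assumes NumPy, $W^{[1]}$ of shape $1 \times n$, a scalar bias, and random test data, and the function names are hypothetical. It compares the candidate expression $\frac{1}{m} (\hat{y} - y) \cdot X^T$ (the form of option B, with the averaging factor from $J$) against a central-difference estimate of $\frac{\partial J}{\partial W^{[1]}}$.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def cost(W, b, X, y):
    """Binary cross-entropy J averaged over the m columns of X."""
    y_hat = sigmoid(W @ X + b)  # shape (1, m)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))


def analytic_grad(W, b, X, y):
    """Candidate gradient: (1/m) * (y_hat - y) @ X.T, shape (1, n)."""
    m = X.shape[1]
    y_hat = sigmoid(W @ X + b)
    return (y_hat - y) @ X.T / m


def numeric_grad(W, b, X, y, eps=1e-6):
    """Central-difference estimate of dJ/dW, one weight at a time."""
    grad = np.zeros_like(W)
    for j in range(W.shape[1]):
        step = np.zeros_like(W)
        step[0, j] = eps
        grad[0, j] = (cost(W + step, b, X, y) - cost(W - step, b, X, y)) / (2 * eps)
    return grad


# Assumed toy problem: n = 4 features, m = 6 examples, random labels.
rng = np.random.default_rng(0)
n, m = 4, 6
X = rng.normal(size=(n, m))
y = rng.integers(0, 2, size=(1, m)).astype(float)
W = 0.1 * rng.normal(size=(1, n))
b = 0.0

g_analytic = analytic_grad(W, b, X, y)
g_numeric = numeric_grad(W, b, X, y)
max_abs_diff = float(np.max(np.abs(g_analytic - g_numeric)))
print("max abs difference:", max_abs_diff)
```

If the two gradients agree to within numerical error, the candidate expression is consistent with the cost function. Swapping in `X @ (y_hat - y).T` instead gives an array of shape `(n, 1)`, which no longer matches the shape of `W`.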
