Natural Language Processing Week 9 NPTEL Assignment Answers 2025

1. Which of the following is/are true?

Options:

  • a) Topic modelling discovers the hidden themes that pervade the collection
  • b) Topic modelling is a generative model
  • c) Dirichlet hyperparameter Beta is used to represent document-topic density
  • d) None of the above

✅ Answer: a, b
Explanation:
Topic modelling uncovers the latent themes that run through a collection (a) and is formulated as a generative model, e.g. LDA (b). Beta controls the topic-word distribution, not the document-topic distribution, so (c) is false.
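To make (b) concrete, here is a toy NumPy sketch of LDA's generative story, assuming a hypothetical 2-topic, 4-word vocabulary (the values of `alpha` and `phi` below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: 2 topics over a 4-word vocabulary.
alpha = np.array([0.5, 0.5])              # document-topic Dirichlet prior
phi = np.array([[0.70, 0.20, 0.05, 0.05],   # topic 0: word distribution
                [0.05, 0.05, 0.20, 0.70]])  # topic 1: word distribution

def generate_document(n_words):
    theta = rng.dirichlet(alpha)                   # per-document topic mixture
    topics = rng.choice(2, size=n_words, p=theta)  # a topic for each position
    return [int(rng.choice(4, p=phi[z])) for z in topics]  # a word per topic

doc = generate_document(10)
print(doc)  # ten word ids drawn from the mixed topics
```

Each document gets its own topic mixture theta, and every word is drawn by first picking a topic, then picking a word from that topic — exactly the "hidden themes" the inference procedure later tries to recover.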


2. Which of the following is/are true?

Options:

  • a) The Dirichlet is an exponential family distribution on the simplex, i.e. positive vectors that sum to one
  • b) Correlated Topic Model (CTM) predicts better via correlated topics
  • c) LDA provides better fit than CTM
  • d) CTM draws topic distributions from a logistic normal

✅ Answer: b, d
Explanation:
CTM draws topic proportions from a logistic normal distribution (d), whose covariance lets topics be correlated, typically giving better predictions than LDA on corpora with interrelated topics (b). CTM, not LDA, provides the better fit, so (c) is false; the simplex contains only non-negative vectors, so (a) is false.
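A minimal NumPy sketch of the logistic-normal draw in (d), with a made-up covariance that correlates two of three topics:

```python
import numpy as np

rng = np.random.default_rng(0)

# CTM replaces LDA's Dirichlet draw with a logistic normal: draw eta
# from a Gaussian whose covariance encodes topic correlations, then
# map it onto the simplex with a softmax.
mu = np.zeros(3)
sigma = np.array([[1.0, 0.8, 0.0],   # topics 0 and 1 positively correlated
                  [0.8, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])

eta = rng.multivariate_normal(mu, sigma)
theta = np.exp(eta) / np.exp(eta).sum()  # logistic (softmax) transform

print(theta)  # a point on the 3-simplex; components sum to one
```

Because the covariance matrix is a free parameter, documents heavy in topic 0 will also tend to be heavy in topic 1 — something a single Dirichlet cannot express.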


3. You have a topic model with α = 0.89 and β = 0.04. To get sparser word distribution and denser topic distribution, what should be the values for α and β?

Options:

  • a) Both α and β should be decreased
  • b) Both α and β should be increased
  • c) α should be decreased, but β should be increased
  • d) α should be increased, but β should be decreased

✅ Answer: d
Explanation:
Increasing α gives denser document-topic mixtures (each document spreads over more topics); decreasing β gives sparser topic-word distributions (each topic concentrates on fewer words). Hence α should be increased and β decreased.


4. Which of the following is/are false about LDA's assumptions?

Options:

  • a) LDA assumes that the order of documents matters
  • b) LDA is not appropriate for corpora that span hundreds of years
  • c) LDA assumes that documents are a mixture of topics and topics are a mixture of words
  • d) LDA can decide on the number of topics by itself.

✅ Answer: a, d
Explanation:
LDA treats documents as exchangeable, so document order does not matter and (a) is false; the number of topics must be specified in advance, so (d) is false.


5. Which of the following is/are true about Relational Topic Model (RTM)?

Options:

  • a) RTM uses the same latent topic assignments to generate document content
  • b) Link function uses linear regression
  • c) Covariates are constructed by the Hadamard product
  • d) Link probability depends on topic assignments that generated words

✅ Answer: a, c, d
Explanation:
RTM jointly models document content and link structure. Covariates are built with the Hadamard product of the documents' topic vectors (c), and link probability depends on the topic assignments that generated the documents' words (a, d). The link function is not linear regression, so (b) is false.


6. Classically, topic models are introduced in the text analysis community for ___________ topic discovery.

Options:

  • a) Unsupervised
  • b) Supervised
  • c) Semi-automated
  • d) None of the above

✅ Answer: a
Explanation:
Topic models like LDA are unsupervised, discovering topics without labeled data.


7. Which of the following is/are false about Gibbs Sampling?

Options:

  • a) Gibbs sampling is a form of Markov chain Monte Carlo (MCMC)
  • b) Sampling is done sequentially until convergence
  • c) It cannot estimate the posterior directly
  • d) It is a variational method

✅ Answer: c, d
Explanation:
Gibbs sampling does estimate the posterior directly via samples, so (c) is false, and it is an MCMC technique rather than a variational method, so (d) is false. Statements (a) and (b) are true.
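As an illustration of (a) and (b) — sequential conditional sampling until convergence — here is a toy Gibbs sampler for a bivariate normal with correlation ρ = 0.8 (a standard textbook example, not the LDA sampler itself):

```python
import numpy as np

rng = np.random.default_rng(0)

# Gibbs sampling: alternately sample each coordinate from its conditional
# given the other, forming a Markov chain whose stationary distribution
# is the target joint (here a bivariate normal with correlation rho).
rho = 0.8
x, y = 0.0, 0.0
samples = []
for step in range(5000):
    x = rng.normal(rho * y, np.sqrt(1 - rho**2))  # sample x | y
    y = rng.normal(rho * x, np.sqrt(1 - rho**2))  # sample y | x
    if step >= 1000:                              # discard burn-in
        samples.append((x, y))

samples = np.array(samples)
print(np.corrcoef(samples.T)[0, 1])  # sample correlation, near rho = 0.8
```

The chain never evaluates the joint density; it only needs the full conditionals — the same property that makes collapsed Gibbs sampling practical for LDA, where each word's topic is resampled given all other assignments.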

8.

✅ Answer: a