abstract

What we want to do

Construct a causal graphical model to predict movie box office

What is a Causal Graphical Model

Construct a causal graph manually, based on past experience

Construct a causal graph automatically

Motivation

  1. Understand Interactions Between Variables
  2. Discover Potential Causal Relationships
  3. Data-Driven Experimentation

Understand Interactions Between Variables

Suppose the Variables Are Independent

[Figure: diagram with nodes Budget, Genres, Cast, Release Date, Marketing and Promotion, ... and Movie Box Office Performance]

  1. Budget: The amount spent on the production of the movie directly impacts the quality of production, marketing, and distribution.
  2. Genres: Different genres attract different audiences and can affect box office performance depending on trends and audience preferences.
  3. Cast: The actors in a movie can draw in audiences, especially if the cast includes popular or well-known stars.
  4. Release Date: Timing plays a critical role, as certain periods (e.g., summer, holidays) typically see higher ticket sales.

[Figure: causal graph with nodes Budget, Cast, Marketing and Promotion, Genres, Release Date, and Movie Box Office Performance]
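A manually constructed causal graph over these variables can be written down as a small directed acyclic graph. The sketch below is only an illustration: apart from the influence of Budget on marketing, which the list above mentions, the edges are assumptions chosen for the example, not edges taken from the slides.

```python
from graphlib import TopologicalSorter

# parents[node] = set of assumed direct causes of that node.
# Hypothetical edges: only Budget -> Marketing and Promotion is suggested by the text.
parents = {
    "Budget": set(),
    "Genres": set(),
    "Release Date": set(),
    "Cast": {"Budget"},
    "Marketing and Promotion": {"Budget"},
    "Movie Box Office Performance": {
        "Budget",
        "Genres",
        "Cast",
        "Release Date",
        "Marketing and Promotion",
    },
}

# static_order() raises graphlib.CycleError if the graph has a cycle,
# so a successful call also confirms that the graph is a valid DAG.
order = list(TopologicalSorter(parents).static_order())

for node in order:
    causes = ", ".join(sorted(parents[node])) or "(no parents)"
    print(f"{node} <- {causes}")
```

Listing the nodes in topological order puts every cause before its effects, which is the order in which a model built on such a graph would be evaluated.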

[Figure: two diagrams, one with nodes Age, Exercise, and Cholesterol, and one with nodes Exercise and Cholesterol only]

Mining potential causal relationships between variables

A and B are causally related, but not correlated
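One way a causal link can exist without correlation is a nonlinear, symmetric dependence. The sketch below uses synthetic data and a hypothetical mechanism, B = A², chosen only for illustration: B is fully determined by A, yet the Pearson correlation between them is close to zero.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is reproducible

# A is standard normal; B is caused by A through an assumed mechanism B = A^2.
a = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
b = [x * x for x in a]

r = pearson(a, b)
print(f"correlation(A, B) = {r:.4f}")  # close to 0 despite the causal link
```

Correlation only measures linear association, so a test based on it alone would miss this relationship.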

Reaching Counter-Intuitive Conclusions

Simpson's Paradox
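Simpson's paradox can be reproduced with the age, exercise, and cholesterol variables. The numbers below are synthetic and the generating formula is an assumption made for this sketch: within every age group more exercise lowers cholesterol, but because older people in this toy data both exercise more and have higher cholesterol, the pooled trend points the other way.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: age drives both exercise and cholesterol.
groups = {}
for age in (20, 30, 40, 50, 60):
    exercise = [age / 10 + d for d in (-1, 0, 1)]               # hours per week
    cholesterol = [100 + 3 * age - 10 * e for e in exercise]    # arbitrary units
    groups[age] = (exercise, cholesterol)

# Trend inside each age group (age held fixed).
within = {age: ols_slope(e, c) for age, (e, c) in groups.items()}

# Trend when age is ignored and all points are pooled.
all_exercise = [e for ex, _ in groups.values() for e in ex]
all_cholesterol = [c for _, ch in groups.values() for c in ch]
pooled = ols_slope(all_exercise, all_cholesterol)

for age, slope in within.items():
    print(f"age {age}: slope = {slope:+.1f}")
print(f"pooled:  slope = {pooled:+.1f}")
```

Every within-group slope is negative while the pooled slope is positive, which is why the causal graph, here with Age as a common cause, is needed to decide which comparison answers the question.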

Data-Driven Experimentation

Multiple Linear Regression
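As a rough illustration of this step, the sketch below fits a linear model with two predictors by solving the normal equations. The predictor names, the numbers, and the helper functions `solve` and `fit_ols` are hypothetical and exist only for this example; it is written in plain Python to stay dependency-free, whereas a real analysis would more likely use a statistics library.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return [intercept, coef_1, ...] minimizing squared error, via (X^T X) beta = X^T y."""
    x = [[1.0] + list(row) for row in rows]  # prepend intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * target for r, target in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Made-up, noise-free data: box_office = 5 + 2 * budget + 3 * marketing.
budget = [10, 20, 30, 40, 50]
marketing = [5, 3, 8, 1, 6]
box_office = [5 + 2 * b + 3 * m for b, m in zip(budget, marketing)]

coefs = fit_ols(list(zip(budget, marketing)), box_office)
print("intercept, budget, marketing:", [round(c, 6) for c in coefs])
```

Because the toy data contain no noise, the fit recovers the generating coefficients; with real data the estimates would carry uncertainty and would only be causal under the assumptions encoded in the graph.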

Time-Series Analysis

Hybrid Model Analysis

References

  1. Wooldridge, J. M. (2016). Introductory econometrics: A modern approach (6th ed.). Cengage Learning.

  2. Weisberg, S. (2005). Applied linear regression (3rd ed.). Wiley.

  3. Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: Forecasting and control (5th ed.). Wiley.

  4. Hamilton, J. D. (1994). Time series analysis. Princeton University Press.

  5. Tsai, C. F., & Hung, C. S. (2014). Modeling credit scoring using neural network ensembles. Neural Computing and Applications, 24(2), 445–452. https://doi.org/10.1007/s00521-012-1247-4

  6. Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159–175. https://doi.org/10.1016/S0925-2312(01)00702-0