class: center, middle, inverse, title-slide

# Plots in R and Python in RStudio
### Sebastien Foulle
### 25/01/2019

---

<style>
.python{
  background-color: gold !important;
}
pre {
  white-space: pre-wrap;
  box-shadow: 10px 5px 5px lightblue;
}
</style>

## Document contents

The following slides present the usual plots in R with *ggplot2* and *patchwork*, and in Python with *seaborn*, *matplotlib* and *pandas*.

Prerequisites for building this **xaringan** presentation with RStudio:

- a Python distribution (Anaconda works very well)
- the R package *reticulate*, which makes R objects available in Python chunks (syntax *r.my_r_object*) and Python objects available in R chunks (syntax *py$my_python_object*)

---

## Loading packages and data

.pull-left[

```r
library("dplyr")
library("ggplot2")
# center the titles of ggplot2 plots
theme_update(plot.title = element_text(hjust = 0.5))
# for the grid of plots
library("patchwork")

# mean tip per party size and sex
dtf = reshape2::tips %>%
  group_by(size, sex) %>%
  summarise(tip = mean(tip)) %>%
  ungroup()

dir.create("images")
library("reticulate")
# path to the Python executable
use_python("C:/Users/Sebastien/Anaconda3/python.exe")
```

```python
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
```
]

---

## The data

.pull-left[
<p style="text-align:center">
</p>

```r
dtf
```

```
# A tibble: 12 x 3
    size sex      tip
   <int> <fct>  <dbl>
 1     1 Female 1.276667
 2     1 Male   1.92
 3     2 Female 2.528448
 4     2 Male   2.614184
 5     3 Female 3.25
 6     3 Male   3.476667
 7     4 Female 4.021111
 8     4 Male   4.172143
 9     5 Female 5.14
10     5 Male   3.75
11     6 Female 4.6
12     6 Male   5.85
```
]

.pull-right[
<p style="text-align:center">
</p>

```python
# r.dtf is the R data frame, exposed to Python by reticulate
dtf = r.dtf
dtf.head(4)
```

```
   size     sex       tip
0     1  Female  1.276667
1     1    Male  1.920000
2     2  Female  2.528448
3     2    Male  2.614184
```

```python
# wide layout: one row per size, one column per sex
dtf_transposees = dtf.pivot(index='size', columns='sex', values='tip')
dtf_transposees.head(4)
```

```
sex     Female      Male
size
1     1.276667  1.920000
2     2.528448  2.614184
3     3.250000  3.476667
4     4.021111  4.172143
```
]

---

## Line plots

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(size, tip, col = sex)) + 
  geom_line()
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-6-1.png" width="450px" height="350px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
sns.lineplot(x="size", y="tip", hue="sex", data=dtf)
plt.show()
```

<img src="images/courbes.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Scatter plot

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(size, tip, col = sex)) + 
  geom_point()
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-10-1.png" width="400px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
sns.scatterplot(x="size", y="tip", hue="sex", data=dtf)
# or, with a fitted regression line per group:
# sns.lmplot(x="size", y="tip", hue="sex", data=dtf)
plt.show()
```

<img src="images/nuage.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Hexagonal binning

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(size, tip)) + 
  geom_hex(bins = 6)
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-14-1.png" width="400px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
plt.hexbin(dtf['size'], dtf['tip'], gridsize=(5,5), cmap="Blues")
plt.xlabel("size")
plt.ylabel("tip")
plt.title("Avec matplotlib")  # "With matplotlib"
plt.colorbar()
plt.show()
```

<img src="images/pavage.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Grouped bar chart

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(size, tip, fill = sex)) + 
  geom_col(position = "dodge")
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-18-1.png" width="400px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
sns.barplot(x="size", y="tip", hue="sex", data=dtf)
plt.show()
```

<img src="images/barplot_groupe.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Stacked bar chart

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(size, tip, fill = sex)) + 
  geom_col()
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-22-1.png" width="400px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
# stacked expects a boolean, not the string "True"
dtf_transposees.plot.bar(stacked=True, rot=0)
plt.title("Avec pandas")  # "With pandas"
plt.show()
```

<img src="images/barplot_empilees.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Histogram

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(tip)) + 
  geom_histogram(bins = 4)
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-26-1.png" width="400px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
dtf.tip.hist(bins=4)
# or, with seaborn (histplot replaces the deprecated distplot):
# sns.histplot(dtf.tip, bins=4)
plt.title("Avec pandas")  # "With pandas"
plt.show()
```

<img src="images/histo.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Density curves

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(tip, fill = sex)) + 
  geom_density(alpha = 0.4)
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-30-1.png" width="400px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
# fill=True replaces the deprecated shade=True
sns.kdeplot(data=dtf, x='tip', hue='sex', fill=True)
plt.show()
```

<img src="images/densite.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Boxplots

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(sex, tip, fill = sex)) + 
  geom_boxplot()
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-34-1.png" width="400px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
sns.boxplot(x="sex", y="tip", data=dtf)
plt.show()
```

<img src="images/boxplots.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Faceting

.pull-left[
<p style="text-align:center">
</p>

```r
ggplot(dtf, aes(size, tip)) + 
  geom_line() + 
  facet_grid(. ~ sex)
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-38-1.png" width="400px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
# one panel per value of sex
sns.FacetGrid(dtf, col="sex").map(sns.lineplot, "size", "tip")
plt.show()
```

<img src="images/facettes.jpg" width="400px" height="300px" style="display: block; margin: auto;" />
]

---

## Grid

.pull-left[
<p style="text-align:center">
</p>

```r
# patchwork: `|` places the two plots side by side
# titles: "nuage" = scatter plot, "barres" = bars
ggplot(dtf, aes(size, tip, col = sex)) + geom_point() + ggtitle("nuage") |
  ggplot(dtf, aes(size, tip, fill = sex)) + geom_col(position = "dodge") + ggtitle("barres")
```

<img src="hebdor_R_Python_graphiques_files/figure-html/unnamed-chunk-42-1.png" width="550px" height="300px" style="display: block; margin: auto;" />
]

.pull-right[
<p style="text-align:center">
</p>

```python
f, axes = plt.subplots(1, 2, figsize=(15, 10))
# titles: "nuage" = scatter plot, "barres" = bars
axes[0].set_title('nuage')
axes[1].set_title('barres')
# each plot goes on the axis carrying its title
sns.scatterplot(x='size', y='tip', hue='sex', data=dtf, ax=axes[0])
sns.barplot(x='size', y='tip', hue='sex', data=dtf, ax=axes[1])
plt.show()
```

<img src="images/grille.jpg" width="400px" height="250px" style="display: block; margin: auto;" />
]
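
---

## Rebuilding the data in Python only

The long-to-wide reshaping used on the data slide can be tried without an R session. The sketch below is a minimal illustration, assuming only *pandas* is installed: it retypes the twelve mean tips printed on that slide instead of reading them from `r.dtf`, and the name `dtf_long` is hypothetical (it does not appear in the original chunks).

```python
import pandas as pd

# Mean tip per (size, sex), copied by hand from the R output on the data slide.
dtf = pd.DataFrame({
    "size": [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6],
    "sex": ["Female", "Male"] * 6,
    "tip": [1.276667, 1.92, 2.528448, 2.614184, 3.25, 3.476667,
            4.021111, 4.172143, 5.14, 3.75, 4.6, 5.85],
})

# Long -> wide: one row per party size, one column per sex.
dtf_transposees = dtf.pivot(index="size", columns="sex", values="tip")

# Wide -> long again (hypothetical helper table), to check that
# the reshaping does not lose any value.
dtf_long = (
    dtf_transposees.reset_index()
    .melt(id_vars="size", var_name="sex", value_name="tip")
    .sort_values(["size", "sex"])
    .reset_index(drop=True)
)

print(dtf_transposees)
```

The wide table is the layout passed to `plot.bar(stacked=True)` on the stacked bar chart slide, while the long table is the layout passed to the *seaborn* calls.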