<label id="jgr5k"></label>
    <legend id="jgr5k"><track id="jgr5k"></track></legend>

    <sub id="jgr5k"></sub>
  1. <u id="jgr5k"></u>
      久草国产视频,91资源总站,在线免费看AV,丁香婷婷社区,久久精品99久久久久久久久,色天使av,无码探花,香蕉av在线
      您正在使用IE低版瀏覽器,為了您的雷峰網(wǎng)賬號安全和更好的產(chǎn)品體驗,強烈建議使用更快更安全的瀏覽器
      此為臨時鏈接,僅用于文章預(yù)覽,將在時失效
      人工智能開發(fā)者 正文
      發(fā)私信給AI研習(xí)社-譯站
      發(fā)送

      0

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      本文作者: AI研習(xí)社-譯站 2018-06-21 15:58
      導(dǎo)語:我們選擇了 2017 幸福度數(shù)集,一個來自 Kaggle 平臺的數(shù)據(jù)集。

      本文為雷鋒網(wǎng)字幕組編譯的技術(shù)博客 Happiness 2017,作者為 Javad Zabihi。

      翻譯 | Binpluto    整理 |  孔令雙

      雷鋒網(wǎng) AI 研習(xí)社:我們選擇了 2017 幸福度數(shù)集,一個來自 Kaggle 平臺的數(shù)集。這份數(shù)集給出了來自世界 155 個國家,關(guān)于包括家庭狀況,平均壽命,經(jīng)濟水平,寬容度,對政府的信任感,自由度和反烏托邦殘留在內(nèi)的七方面因素的幸福等級和幸福值。這七項的分值之和就是幸福值,幸福值越高,幸福等級越低。因此,很顯然七項中的每一項分值越高,意味著幸福水平越高 。我們將這七項因素定義為影響幸福的因素。反烏托邦是烏托邦的對立面,意味著最低的幸福水平。它將被視為其他國家判斷他們離最不幸福國家有多遠(yuǎn)的參考對象。

      我的報告包含了以下三部分:

      • 凈化

      • 可視化

      • 預(yù)測

      任務(wù)

      選擇這項任務(wù)的目的在于找出,哪些因素對人們過上幸福的生活更重要。根據(jù)結(jié)果,人們和國家可以專注于更關(guān)鍵的因素來實現(xiàn)更高的幸福水準(zhǔn)。我們也將運用不同的機器學(xué)習(xí)算法來預(yù)測幸福值并比較預(yù)測結(jié)果,來判斷哪一種算法更適用于這個數(shù)集。

      凈化

      現(xiàn)在我們可以導(dǎo)入數(shù)集并觀察幸福變量的結(jié)構(gòu)。我們的數(shù)集已經(jīng)相當(dāng)?shù)恼麧崳贿^,我們?nèi)詫鞒鲆稽c調(diào)整來使它看起來更好。

      library(plyr)

      library(dplyr)

      library(tidyverse)

      library(lubridate)

      library(caTools)

      library(ggplot2)

      library(ggthemes)

      library(reshape2)

      library(data.table)

      library(tidyr)

      library(corrgram)       

      library(corrplot)

      library(formattable)

      library(cowplot)

      library(ggpubr)

      library(plot3D)

      # World happiness report 2017

      Happiness <- read.csv("../input/2017.csv")

      str(Happiness)

      ## 'data.frame':    155 obs. of  12 variables:

      ##  $ Country                      : Factor w/ 155 levels "Afghanistan",..: 105 38 58 133 45 99 26 100 132 7 ...

      ##  $ Happiness.Rank               : int  1 2 3 4 5 6 7 8 9 10 ...

      ##  $ Happiness.Score              : num  7.54 7.52 7.5 7.49 7.47 ...

      ##  $ Whisker.high                 : num  7.59 7.58 7.62 7.56 7.53 ...

      ##  $ Whisker.low                  : num  7.48 7.46 7.39 7.43 7.41 ...

      ##  $ Economy..GDP.per.Capita.     : num  1.62 1.48 1.48 1.56 1.44 ...

      ##  $ Family                       : num  1.53 1.55 1.61 1.52 1.54 ...

      ##  $ Health..Life.Expectancy.     : num  0.797 0.793 0.834 0.858 0.809 ...

      ##  $ Freedom                      : num  0.635 0.626 0.627 0.62 0.618 ...

      ##  $ Generosity                   : num  0.362 0.355 0.476 0.291 0.245 ...

      ##  $ Trust..Government.Corruption.: num  0.316 0.401 0.154 0.367 0.383 ...

      ##  $ Dystopia.Residual            : num  2.28 2.31 2.32 2.28 2.43 ...

      我們已經(jīng)觀察了包含在數(shù)集內(nèi)的變量,他們對應(yīng)的類,以及每一個變量中的前幾個觀察項。事實上,這個數(shù)集含有 155 個觀察項和 12 個變量。我認(rèn)為一些變量名還不夠明了,于是決定改掉其中一些變量名。同時,盒須低和盒須高的變量將從數(shù)據(jù)集中移除,因為這些變量只能使幸福值的可靠區(qū)間變得更低或更高而對于可視化和預(yù)測方面沒有實際意義。

      # Changing the name of columns

      colnames (Happiness) <- c("Country", "Happiness.Rank", "Happiness.Score",

                                "Whisker.High", "Whisker.Low", "Economy", "Family",

                                "Life.Expectancy", "Freedom", "Generosity",

                                "Trust", "Dystopia.Residual")

      # Country: Name of countries

      # Happiness.Rank: Rank of the country based on the Happiness Score

      # Happiness.Score: Happiness measurement on a scale of 0 to 10

      # Whisker.High: Upper confidence interval of happiness score

      # Whisker.Low: Lower confidence interval of happiness score

      # Economy: The value of all final goods and services produced within a nation in a given year

      # Per capita GDP is a measure of the total output of a country that takes the gross domestic product (GDP) and divides it by the number of people in that country. The per capita GDP is especially useful when comparing one country to another, because it shows the relative performance of the countries.

      # Family: Importance of having a family

      # Life.Expectancy: Importance of health and amount of time prople expect to live

      # Freedom: Importance of freedom in each country

      # Generosity: The quality of being kind and generous

      # Trust: Perception of corruption in a government

      # Dystopia.Residual: Plays as a reference


      # Deleting unnecessary columns (Whisker.high and Whisker.low)


      Happiness <- Happiness[, -c(4,5)]

      下一步是在大洲的數(shù)據(jù)列表中添入另一列。我想要對不同的洲進行工作,以探究不同的因素對得到更高的幸福值是否會有不同的影響效果。亞洲,非洲,北美洲,南美洲,歐洲以及澳洲是數(shù)集里的六大洲。接著我將各大洲的這一列移到第二列,因為我認(rèn)為這樣的布局安排會使得數(shù)據(jù)集更直觀。最后, 我將大洲變量的數(shù)據(jù)類型改成指數(shù)型,這樣可以簡化可視化方面的工作。現(xiàn)在我們可以看到數(shù)集的最終結(jié)構(gòu),它包含了 155 個觀察項和 11 個變量。國家和大洲都是指數(shù)型變量,幸福等級是整數(shù)型變量,剩余變量都是數(shù)字型。

      # Creating a new column for continents


      Happiness$Continent <- NA


      Happiness$Continent[which(Happiness$Country %in% c("Israel", "United Arab Emirates", "Singapore", "Thailand", "Taiwan Province of China",

                                         "Qatar", "Saudi Arabia", "Kuwait", "Bahrain", "Malaysia", "Uzbekistan", "Japan",

                                         "South Korea", "Turkmenistan", "Kazakhstan", "Turkey", "Hong Kong S.A.R., China", "Philippines",

                                         "Jordan", "China", "Pakistan", "Indonesia", "Azerbaijan", "Lebanon", "Vietnam",

                                         "Tajikistan", "Bhutan", "Kyrgyzstan", "Nepal", "Mongolia", "Palestinian Territories",

                                         "Iran", "Bangladesh", "Myanmar", "Iraq", "Sri Lanka", "Armenia", "India", "Georgia",

                                         "Cambodia", "Afghanistan", "Yemen", "Syria"))] <- "Asia"

      Happiness$Continent[which(Happiness$Country %in% c("Norway", "Denmark", "Iceland", "Switzerland", "Finland",

                                         "Netherlands", "Sweden", "Austria", "Ireland", "Germany",

                                         "Belgium", "Luxembourg", "United Kingdom", "Czech Republic",

                                         "Malta", "France", "Spain", "Slovakia", "Poland", "Italy",

                                         "Russia", "Lithuania", "Latvia", "Moldova", "Romania",

                                         "Slovenia", "North Cyprus", "Cyprus", "Estonia", "Belarus",

                                         "Serbia", "Hungary", "Croatia", "Kosovo", "Montenegro",

                                         "Greece", "Portugal", "Bosnia and Herzegovina", "Macedonia",

                                         "Bulgaria", "Albania", "Ukraine"))] <- "Europe"

      Happiness$Continent[which(Happiness$Country %in% c("Canada", "Costa Rica", "United States", "Mexico",  

                                         "Panama","Trinidad and Tobago", "El Salvador", "Belize", "Guatemala",

                                         "Jamaica", "Nicaragua", "Dominican Republic", "Honduras",

                                         "Haiti"))] <- "North America"

      Happiness$Continent[which(Happiness$Country %in% c("Chile", "Brazil", "Argentina", "Uruguay",

                                         "Colombia", "Ecuador", "Bolivia", "Peru",

                                         "Paraguay", "Venezuela"))] <- "South America"

      Happiness$Continent[which(Happiness$Country %in% c("New Zealand", "Australia"))] <- "Australia"

      Happiness$Continent[which(is.na(Happiness$Continent))] <- "Africa"



      # Moving the continent column's position in the dataset to the second column


      Happiness <- Happiness %>% select(Country,Continent, everything())


      # Changing Continent column to factor


      Happiness$Continent <- as.factor(Happiness$Continent)


      str(Happiness)

      ## 'data.frame':    155 obs. of  11 variables:

      ##  $ Country          : Factor w/ 155 levels "Afghanistan",..: 105 38 58 133 45 99 26 100 132 7 ...

      ##  $ Continent        : Factor w/ 6 levels "Africa","Asia",..: 4 4 4 4 4 4 5 3 4 3 ...

      ##  $ Happiness.Rank   : int  1 2 3 4 5 6 7 8 9 10 ...

      ##  $ Happiness.Score  : num  7.54 7.52 7.5 7.49 7.47 ...

      ##  $ Economy          : num  1.62 1.48 1.48 1.56 1.44 ...

      ##  $ Family           : num  1.53 1.55 1.61 1.52 1.54 ...

      ##  $ Life.Expectancy  : num  0.797 0.793 0.834 0.858 0.809 ...

      ##  $ Freedom          : num  0.635 0.626 0.627 0.62 0.618 ...

      ##  $ Generosity       : num  0.362 0.355 0.476 0.291 0.245 ...

      ##  $ Trust            : num  0.316 0.401 0.154 0.367 0.383 ...

      ##  $ Dystopia.Residual: num  2.28 2.31 2.32 2.28 2.43 ...

      可視化

      在這一章節(jié),我們將處理不同的變量來找出它們之間的相關(guān)性。

      相關(guān)性曲線

      讓我們來觀察數(shù)集中各數(shù)字型變量之間的相關(guān)性。

      ########## Correlation between variables


      # Finding the correlation between numerical columns

      Num.cols <- sapply(Happiness, is.numeric)

      Cor.data <- cor(Happiness[, Num.cols])


      corrplot(Cor.data, method = 'color')  

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      顯然,“幸福等級” 與所有其他所有數(shù)字型變量之間呈負(fù)相關(guān)關(guān)系。換句話說就是,幸福等級越低,幸福值就越高,且其他七個因素對幸福的貢獻(xiàn)越大。 所以把幸福等級這個因素刪除 ,再次觀察相關(guān)性。

      # Create a correlation plot

      newdatacor = cor(Happiness[c(4:11)])

      corrplot(newdatacor, method = "number")

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      根據(jù)以上的相關(guān)系數(shù)圖,經(jīng)濟水平,平均壽命和家庭支出對幸福感起最關(guān)鍵的作用。對政府的信任感和慷慨度對幸福值的影響最小。

      比較不同的洲的幸福變量

      來計算一下平均幸福值和每個洲的其他七個因素的平均值。然后將不同的變量和對應(yīng)的數(shù)值轉(zhuǎn)化到單獨的列中。最終,使用 ggplot 來表現(xiàn)不同大陸之間的差異性。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      Happiness.Continent <- Happiness %>%

                                select(-3) %>%

                                group_by(Continent) %>%

                                summarise_at(vars(-Country), funs(mean(., na.rm=TRUE)))


      # Or we can use aggregate

      # aggregate(Happiness[, 4:11], list(Happiness$Continent), mean)


      # Melting the "Happiness.Continent" dataset

      Happiness.Continent.melt <- melt(Happiness.Continent)



      # Faceting

      ggplot(Happiness.Continent.melt, aes(y=value, x=Continent, color=Continent, fill=Continent)) + 

        geom_bar( stat="identity") +    

        facet_wrap(~variable) + theme_bw() +

        theme(axis.text.x = element_text(angle = 90, hjust = 1)) +

        labs(title = "Average value of happiness variables for different continents", 

             y = "Average value") 

      我們可以看到,澳大利亞在除反烏托邦殘余方面之外,幾乎所有的方面都達(dá)到最高的平均值,接著是歐洲,北美洲和南美洲幾乎相同的幸福平均值和剩余七方面的平均值。最后是,亞洲和非洲在所有方面都拿到了最低分。

      每個洲的相關(guān)曲線

      我們來看每個洲不同變量間的相關(guān)性。

      corrgram(Happiness %>% select(-3) %>% filter(Continent == "Africa"), order=TRUE,

               upper.panel=panel.cor, main="Happiness Matrix for Africa")

      在非洲 “幸福值” 和其他變量的相關(guān)性:

      經(jīng)濟水平 > 家庭狀況 > 平均壽命> 反烏托邦殘留 > 自由度

      幸福值與對政府的信任感之間沒有相關(guān)性。

      幸福值和慷慨度之間呈反相關(guān)關(guān)系。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      corrgram(Happiness %>% select(-3) %>% filter(Continent == "Asia"), order=TRUE,

               upper.

      =panel.cor, main="Happiness Matrix for Asia")

      在亞洲 “幸福值” 和其他變量的相關(guān)性: 

      經(jīng)濟水平 > 家庭狀況 > 平均壽命 > 自由度 > 對政府的信任感 > 反烏托邦殘留

      幸福值和慷慨度之間沒有相關(guān)性。

      corrgram(Happiness %>% select(-3) %>% filter(Continent == "Europe"), order=TRUE,

               upper.panel=panel.cor, main="Happiness Matrix for Europe")

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      在歐洲 “幸福值” 和其他變量的相關(guān)性: 

      自由度 > 對政府的信任感 > 經(jīng)濟水平 > 家庭狀況 > 反烏托邦殘留 > 平均壽命 > 慷慨度

      歐洲出現(xiàn)了慷慨讀和幸福值之間最高的相關(guān)性。

      corrgram(Happiness %>% select(-3) %>% filter(Continent == "North America"), order=TRUE,

               upper.panel=panel.cor, main="Happiness Matrix for North America")

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      在北美洲的 “幸福值” 和其他變量的相關(guān)性: 

      平均壽命 > 經(jīng)濟水平 > 自由度 > 家庭狀況 > 反烏托邦殘留 > 對政府的信任感

      在這里幸福值和慷慨度呈現(xiàn)反相關(guān)關(guān)系。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      corrgram(Happiness %>% select(-3) %>% filter(Continent == "South America"), order=TRUE,

               upper.panel=panel.cor, main="Happiness Matrix for South America")

      在南美洲的 “幸福值” 和其他變量的相關(guān)性: 

      反烏托邦殘留 > 經(jīng)濟水平 > 平均壽命 > 自由度 > 慷慨度 > 對政府的信任感>家庭狀況

      在南美洲,家庭狀況是最不重要的。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      比較不同大洲的幸福值

      我們將會使用散點圖,盒式圖和小提琴型圖來觀察幸福值在不同國家的分布,幸福值是如何填充在這些大洲的,并對每個大洲計算平均值和中位數(shù)。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      ####### Happiness score for each continent


      gg1 <- ggplot(Happiness,

                    aes(x=Continent,

                        y=Happiness.Score,

                        color=Continent))+

        geom_point() + theme_bw() +

        theme(axis.title = element_text(family = "Helvetica", size = (8)))


      gg2 <- ggplot(Happiness , aes(x = Continent, y = Happiness.Score)) +

        geom_boxplot(aes(fill=Continent)) + theme_bw() +

        theme(axis.title = element_text(family = "Helvetica", size = (8)))


      gg3 <- ggplot(Happiness,aes(x=Continent,y=Happiness.Score))+

        geom_violin(aes(fill=Continent),alpha=0.7)+ theme_bw() +

        theme(axis.title = element_text(family = "Helvetica", size = (8)))


      # Compute descriptive statistics by groups

      stable <- desc_statby(Happiness, measure.var = "Happiness.Score",

                            grps = "Continent")

      stable <- stable[, c("Continent","mean","median")]

      names(stable) <- c("Continent", "Mean of happiness score","Median of happiness score")

      # Summary table plot

      stable.p <- ggtexttable(stable,rows = NULL, 

                               theme = ttheme("classic"))



      ggarrange(gg1, gg2, ncol = 1, nrow = 2)

      ggarrange(gg3, stable.p, ncol = 1, nrow = 2)

      正如我們所看到的那樣,澳大利亞的幸福值的中位數(shù)值最高。歐洲,南美洲,北美洲的中位數(shù)值并列第二。亞洲的數(shù)值最低,位于非洲以后。 我們可以得到不同大洲的幸福值范圍,以及幸福值的集中程度。

      帶有回歸線的散點圖

      通過給不同大洲畫散點圖,我們可以確定幸福數(shù)據(jù)與數(shù)集中的其他七個因素之間的相關(guān)性。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      ggplot(subset(Happiness, Happiness$Continent != "Australia"), aes(x = Life.Expectancy, y = Happiness.Score)) + 

        geom_point(aes(color=Continent), size = 3, alpha = 0.8) +  

        geom_smooth(aes(color = Continent, fill = Continent), 

                    method = "lm", fullrange = TRUE) +

        facet_wrap(~Continent) +

        theme_bw() + labs(title = "Scatter plot with regression line")

      平均壽命與幸福值的相關(guān)性在歐洲,北美洲和亞洲比其他洲更明顯。值得一提的是,澳大利亞的數(shù)據(jù)將不被計入結(jié)果,因為澳洲只有兩個國家,關(guān)于它作出的散點圖將不會有任何意義。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      ggplot(subset(Happiness, Happiness$Continent != "Australia"), aes(x = Economy, y = Happiness.Score)) + 

        geom_point(aes(color=Continent), size = 3, alpha = 0.8) +  

        geom_smooth(aes(color = Continent, fill = Continent), 

                    method = "lm", fullrange = TRUE) +

        facet_wrap(~Continent) +

        theme_bw() + labs(title = "Scatter plot with regression line")

      關(guān)于幸福值與經(jīng)濟水平的相關(guān)性,我們可以得到于前者很接近的結(jié)果。在這一個因素作用下,非洲的結(jié)果排在最末位。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      ggplot(subset(Happiness, Happiness$Continent != "Australia"), aes(x = Freedom, y = Happiness.Score)) + 

        geom_point(aes(color=Continent), size = 3, alpha = 0.8) +  

        geom_smooth(aes(color = Continent, fill = Continent), 

                    method = "lm", fullrange = TRUE) +

        facet_wrap(~Continent) +

        theme_bw() + labs(title = "Scatter plot with regression line")

      自由度與幸福值的相關(guān)性,在歐洲和北美洲比其他任何洲都要顯著。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      ggplot(subset(Happiness, Happiness$Continent != "Australia"), aes(x = Trust, y = Happiness.Score)) + 

        geom_point(aes(color=Continent), size = 3, alpha = 0.8) +  

        geom_smooth(aes(color = Continent, fill = Continent), 

                    method = "lm", fullrange = TRUE) +

        facet_wrap(~Continent) +

        theme_bw() + labs(title = "Scatter plot with regression line")

      在非洲,對政府的信任感和幸福值之間幾乎沒有相關(guān)性。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      ggplot(subset(Happiness, Happiness$Continent != "Australia"), aes(x = Generosity, y = Happiness.Score)) + 

        geom_point(aes(color=Continent), size = 3, alpha = 0.8) +  

        geom_smooth(aes(color = Continent, fill = Continent), 

                    method = "lm", fullrange = TRUE) +

        facet_wrap(~Continent) +

        theme_bw() + labs(title = "Scatter plot with regression line")

      只有歐洲和南美洲對應(yīng)的回歸線的斜率為正。亞洲的回歸線是水平的,非洲和北美洲對應(yīng)的回歸線斜率為負(fù)。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      ggplot(subset(Happiness, Happiness$Continent != "Australia"), aes(x = Family, y = Happiness.Score)) + 

        geom_point(aes(color=Continent), size = 3, alpha = 0.8) +  

        geom_smooth(aes(color = Continent, fill = Continent), 

                    method = "lm", fullrange = TRUE) +

        facet_wrap(~Continent) +

        theme_bw() + labs(title = "Scatter plot with regression line")

      對于南美洲而言,隨著家庭值增加,幸福值保持不變。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      ggplot(subset(Happiness, Happiness$Continent != "Australia"), aes(x = Dystopia.Residual, y = Happiness.Score)) + 

        geom_point(aes(color=Continent), size = 3, alpha = 0.8) +  

        geom_smooth(aes(color = Continent, fill = Continent), 

                    method = "lm", fullrange = TRUE) +

        facet_wrap(~Continent) +

        theme_bw() + labs(title = "Scatter plot with regression line")

      針對反烏托邦殘留這項因素,所有大洲的表現(xiàn)近乎一致。

      對不同大洲著色的散點圖

      以下是觀察幸福值與不同變量的相關(guān)性,在不同大陸中分布的另一種方式。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      #::::::::::::::::::::::::::::Family::::::::::::::::::::::::::::::

      sp <- ggscatter(Happiness, x = "Family", y = "Happiness.Score",

                      color = "Continent", palette = "jco",

                      size = 3, alpha = 0.6)

      # Create box plots of x/y variables

      # Box plot of the x variable

      xbp <- ggboxplot(Happiness$Family, width = 0.3, fill = "lightgray") +

        rotate() +

        theme_transparent()

      # Box plot of the y variable

      ybp <- ggboxplot(Happiness$Happiness.Score, width = 0.3, fill = "lightgray") +

        theme_transparent()

      # Create the external graphical objects

      # called a "grop" in Grid terminology

      xbp_grob <- ggplotGrob(xbp)

      ybp_grob <- ggplotGrob(ybp)

      # Place box plots inside the scatter plot

      xmin <- min(Happiness$Family); xmax <- max(Happiness$Family)

      ymin <- min(Happiness$Happiness.Score); ymax <- max(Happiness$Happiness.Score)

      yoffset <- (1/15)*ymax; xoffset <- (1/15)*xmax

      # Insert xbp_grob inside the scatter plot

      sp + annotation_custom(grob = xbp_grob, xmin = xmin, xmax = xmax, 

                             ymin = ymin-yoffset, ymax = ymin+yoffset) +

        # Insert ybp_grob inside the scatter plot

        annotation_custom(grob = ybp_grob,

                          xmin = xmin-xoffset, xmax = xmin+xoffset, 

                          ymin = ymin, ymax = ymax)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      #::::::::::::::::::::::::::::Generosity::::::::::::::::::::::::::::::

      sp <- ggscatter(Happiness, x = "Generosity", y = "Happiness.Score",

                      color = "Continent", palette = "jco",

                      size = 3, alpha = 0.6)

      # Create box plots of x/y variables

      # Box plot of the x variable

      xbp <- ggboxplot(Happiness$Generosity, width = 0.3, fill = "lightgray") +

        rotate() +

        theme_transparent()

      # Box plot of the y variable

      ybp <- ggboxplot(Happiness$Happiness.Score, width = 0.3, fill = "lightgray") +

        theme_transparent()

      # Create the external graphical objects

      # called a "grop" in Grid terminology

      xbp_grob <- ggplotGrob(xbp)

      ybp_grob <- ggplotGrob(ybp)

      # Place box plots inside the scatter plot

      xmin <- min(Happiness$Generosity); xmax <- max(Happiness$Generosity)

      ymin <- min(Happiness$Happiness.Score); ymax <- max(Happiness$Happiness.Score)

      yoffset <- (1/15)*ymax; xoffset <- (1/15)*xmax

      # Insert xbp_grob inside the scatter plot

      sp + annotation_custom(grob = xbp_grob, xmin = xmin, xmax = xmax, 

                             ymin = ymin-yoffset, ymax = ymin+yoffset) +

        # Insert ybp_grob inside the scatter plot

        annotation_custom(grob = ybp_grob,

                          xmin = xmin-xoffset, xmax = xmin+xoffset, 

                          ymin = ymin, ymax = ymax)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      #::::::::::::::::::::::::::::Life.Expectancy::::::::::::::::::::::::::::::

      sp <- ggscatter(Happiness, x = "Life.Expectancy", y = "Happiness.Score",

                      color = "Continent", palette = "jco",

                      size = 3, alpha = 0.6)

      # Create box plots of x/y variables

      # Box plot of the x variable

      xbp <- ggboxplot(Happiness$Life.Expectancy, width = 0.3, fill = "lightgray") +

        rotate() +

        theme_transparent()

      # Box plot of the y variable

      ybp <- ggboxplot(Happiness$Happiness.Score, width = 0.3, fill = "lightgray") +

        theme_transparent()

      # Create the external graphical objects

      # called a "grop" in Grid terminology

      xbp_grob <- ggplotGrob(xbp)

      ybp_grob <- ggplotGrob(ybp)

      # Place box plots inside the scatter plot

      xmin <- min(Happiness$Life.Expectancy); xmax <- max(Happiness$Life.Expectancy)

      ymin <- min(Happiness$Happiness.Score); ymax <- max(Happiness$Happiness.Score)

      yoffset <- (1/15)*ymax; xoffset <- (1/15)*xmax

      # Insert xbp_grob inside the scatter plot

      sp + annotation_custom(grob = xbp_grob, xmin = xmin, xmax = xmax, 

                             ymin = ymin-yoffset, ymax = ymin+yoffset) +

        # Insert ybp_grob inside the scatter plot

        annotation_custom(grob = ybp_grob,

                          xmin = xmin-xoffset, xmax = xmin+xoffset, 

                          ymin = ymin, ymax = ymax)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      #::::::::::::::::::::::::::::Freedom::::::::::::::::::::::::::::::

      sp <- ggscatter(Happiness, x = "Freedom", y = "Happiness.Score",

                      color = "Continent", palette = "jco",

                      size = 3, alpha = 0.6)

      # Create box plots of x/y variables

      # Box plot of the x variable

      xbp <- ggboxplot(Happiness$Freedom, width = 0.3, fill = "lightgray") +

        rotate() +

        theme_transparent()

      # Box plot of the y variable

      ybp <- ggboxplot(Happiness$Happiness.Score, width = 0.3, fill = "lightgray") +

        theme_transparent()

      # Create the external graphical objects

      # called a "grop" in Grid terminology

      xbp_grob <- ggplotGrob(xbp)

      ybp_grob <- ggplotGrob(ybp)

      # Place box plots inside the scatter plot

      xmin <- min(Happiness$Freedom); xmax <- max(Happiness$Freedom)

      ymin <- min(Happiness$Happiness.Score); ymax <- max(Happiness$Happiness.Score)

      yoffset <- (1/15)*ymax; xoffset <- (1/15)*xmax

      # Insert xbp_grob inside the scatter plot

      sp + annotation_custom(grob = xbp_grob, xmin = xmin, xmax = xmax, 

                             ymin = ymin-yoffset, ymax = ymin+yoffset) +

        # Insert ybp_grob inside the scatter plot

        annotation_custom(grob = ybp_grob,

                          xmin = xmin-xoffset, xmax = xmin+xoffset, 

                          ymin = ymin, ymax = ymax)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      #::::::::::::::::::::::::::::Economy::::::::::::::::::::::::::::::

      sp <- ggscatter(Happiness, x = "Economy", y = "Happiness.Score",

                      color = "Continent", palette = "jco",

                      size = 3, alpha = 0.6)

      # Create box plots of x/y variables

      # Box plot of the x variable

      xbp <- ggboxplot(Happiness$Economy, width = 0.3, fill = "lightgray") +

        rotate() +

        theme_transparent()

      # Box plot of the y variable

      ybp <- ggboxplot(Happiness$Happiness.Score, width = 0.3, fill = "lightgray") +

        theme_transparent()

      # Create the external graphical objects

      # called a "grop" in Grid terminology

      xbp_grob <- ggplotGrob(xbp)

      ybp_grob <- ggplotGrob(ybp)

      # Place box plots inside the scatter plot

      xmin <- min(Happiness$Economy); xmax <- max(Happiness$Economy)

      ymin <- min(Happiness$Happiness.Score); ymax <- max(Happiness$Happiness.Score)

      yoffset <- (1/15)*ymax; xoffset <- (1/15)*xmax

      # Insert xbp_grob inside the scatter plot

      sp + annotation_custom(grob = xbp_grob, xmin = xmin, xmax = xmax, 

                             ymin = ymin-yoffset, ymax = ymin+yoffset) +

        # Insert ybp_grob inside the scatter plot

        annotation_custom(grob = ybp_grob,

                          xmin = xmin-xoffset, xmax = xmin+xoffset, 

                          ymin = ymin, ymax = ymax)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      #::::::::::::::::::::::::::::Trust::::::::::::::::::::::::::::::

      sp <- ggscatter(Happiness, x = "Trust", y = "Happiness.Score",

                      color = "Continent", palette = "jco",

                      size = 3, alpha = 0.6)

      # Create box plots of x/y variables

      # Box plot of the x variable

      xbp <- ggboxplot(Happiness$Trust, width = 0.3, fill = "lightgray") +

        rotate() +

        theme_transparent()

      # Box plot of the y variable

      ybp <- ggboxplot(Happiness$Happiness.Score, width = 0.3, fill = "lightgray") +

        theme_transparent()

      # Create the external graphical objects

      # called a "grop" in Grid terminology

      xbp_grob <- ggplotGrob(xbp)

      ybp_grob <- ggplotGrob(ybp)

      # Place box plots inside the scatter plot

      xmin <- min(Happiness$Trust); xmax <- max(Happiness$Trust)

      ymin <- min(Happiness$Happiness.Score); ymax <- max(Happiness$Happiness.Score)

      yoffset <- (1/15)*ymax; xoffset <- (1/15)*xmax

      # Insert xbp_grob inside the scatter plot

      sp + annotation_custom(grob = xbp_grob, xmin = xmin, xmax = xmax, 

                             ymin = ymin-yoffset, ymax = ymin+yoffset) +

        # Insert ybp_grob inside the scatter plot

        annotation_custom(grob = ybp_grob,

                          xmin = xmin-xoffset, xmax = xmin+xoffset, 

                          ymin = ymin, ymax = ymax)


      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      #::::::::::::::::::::::::::::Dystopia.Residual::::::::::::::::::::::::::::::

      sp <- ggscatter(Happiness, x = "Dystopia.Residual", y = "Happiness.Score",

                      color = "Continent", palette = "jco",

                      size = 3, alpha = 0.6)

      # Create box plots of x/y variables

      # Box plot of the x variable

      xbp <- ggboxplot(Happiness$Dystopia.Residual, width = 0.3, fill = "lightgray") +

        rotate() +

        theme_transparent()

      # Box plot of the y variable

      ybp <- ggboxplot(Happiness$Happiness.Score, width = 0.3, fill = "lightgray") +

        theme_transparent()

      # Create the external graphical objects

      # called a "grop" in Grid terminology

      xbp_grob <- ggplotGrob(xbp)

      ybp_grob <- ggplotGrob(ybp)

      # Place box plots inside the scatter plot

      xmin <- min(Happiness$Dystopia.Residual); xmax <- max(Happiness$Dystopia.Residual)

      ymin <- min(Happiness$Happiness.Score); ymax <- max(Happiness$Happiness.Score)

      yoffset <- (1/15)*ymax; xoffset <- (1/15)*xmax

      # Insert xbp_grob inside the scatter plot

      sp + annotation_custom(grob = xbp_grob, xmin = xmin, xmax = xmax, 

                             ymin = ymin-yoffset, ymax = ymin+yoffset) +

        # Insert ybp_grob inside the scatter plot

        annotation_custom(grob = ybp_grob,

                          xmin = xmin-xoffset, xmax = xmin+xoffset, 

                          ymin = ymin, ymax = ymax)

      三維圖

      在可視化的最后一部分,讓我們作一些特別的圖。 我必須聲明,我并不支持三維圖或任何特別的圖,不過我們可以通過它們增加一些趣味性! 

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      scatter3D(Happiness$Freedom, Happiness$Life.Expectancy, Happiness$Happiness.Score, phi = 0, bty = "g",

                pch = 20, cex = 2, ticktype = "detailed",

                main = "Happiness data", xlab = "Freedom",

                ylab ="Life.Expectancy", zlab = "Happiness.Score")


      根據(jù)這個圖可以得知,平均壽命越高,自由度越強,幸福值就會越高。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      scatter3D(Happiness$Generosity, Happiness$Economy, Happiness$Happiness.Score, phi = 0, bty = "g",

                pch = 20, cex = 2, ticktype = "detailed",

                main = "Happiness data", xlab = "Generosity",

                ylab ="Economy", zlab = "Happiness.Score")

      經(jīng)濟水平越高,慷慨度越低,會導(dǎo)致更高的幸福水平。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      scatter3D(Happiness$Trust, Happiness$Freedom, Happiness$Happiness.Score, phi = 0, bty = "g",

                pch = 20, cex = 2, ticktype = "detailed",

                main = "Happiness data", xlab = "Trust",

                ylab ="Freedom", zlab = "Happiness.Score")

      總體而言,對政府的信任感對于取得更高的幸福值并不會起到關(guān)鍵作用。不過我們可以看到,對于那些自由度很重要并且幸福值超過 7 分的國家來說,信任感很重要。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      scatter3D(Happiness$Family, Happiness$Economy, Happiness$Happiness.Score, phi = 0, bty = "g",

                pch = 20, cex = 2, ticktype = "detailed",

                main = "Happiness data", xlab = "Trust",

                ylab ="Economy", zlab = "Happiness.Score")

      對于那些幸福值低于 5 分的國家通過相關(guān)性曲線可以看出,隨著經(jīng)濟水平的提高和幸福值的增加,信任感保持不變。而在幸福值 5 分這一點之后,信任感對幸福值增加的影響逐漸增大。

      預(yù)測

      在這一章,我們將運用不同的機器學(xué)習(xí)算法來預(yù)測幸福值。首先,我們需要將數(shù)集分成訓(xùn)練組和測試組。我們的因變量是幸福值,自變量分別是家庭狀況,經(jīng)濟水平,信任感,自由度,慷慨度和反烏托邦殘留。

      # Splitting the dataset into the Training set and Test set

      # install.packages('caTools')

      library(caTools)

      set.seed(123)

      dataset <- Happiness[4:11]

      split = sample.split(dataset$Happiness.Score, SplitRatio = 0.8)

      training_set = subset(dataset, split == TRUE)

      test_set = subset(dataset, split == FALSE)

      多元線性回歸

      # Fitting Multiple Linear Regression to the Training set

      regressor_lm = lm(formula = Happiness.Score ~ .,

                     data = training_set)


      summary(regressor_lm)

      ## 

      ## Call:

      ## lm(formula = Happiness.Score ~ ., data = training_set)

      ## 

      ## Residuals:

      ##        Min         1Q     Median         3Q        Max 

      ## -5.907e-04 -2.008e-04 -1.600e-07  2.510e-04  4.855e-04 

      ## 

      ## Coefficients:

      ##                    Estimate Std. Error   t value Pr(>|t|)    

      ## (Intercept)       1.701e-04  1.509e-04     1.127    0.262    

      ## Economy           1.000e+00  1.300e-04  7690.839   <2e-16 ***

      ## Family            9.999e-01  1.253e-04  7981.804   <2e-16 ***

      ## Life.Expectancy   9.997e-01  2.122e-04  4711.655   <2e-16 ***

      ## Freedom           9.999e-01  2.245e-04  4453.253   <2e-16 ***

      ## Generosity        1.000e+00  2.310e-04  4330.040   <2e-16 ***

      ## Trust             9.997e-01  3.335e-04  2997.191   <2e-16 ***

      ## Dystopia.Residual 1.000e+00  5.452e-05 18343.021   <2e-16 ***

      ## ---

      ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

      ## 

      ## Residual standard error: 0.0002848 on 116 degrees of freedom

      ## Multiple R-squared:      1,  Adjusted R-squared:      1 

      ## F-statistic: 2.689e+08 on 7 and 116 DF,  p-value: < 2.2e-16

      結(jié)果表明,所有的自變量都有重要的作用且校準(zhǔn)的 R 的平方為 1!就如我們探討的那樣,自變量與因變量之間存在明顯的線性關(guān)系。同樣,我要強調(diào)自變量之和等于因變量,也就是幸福值。這就證明了校準(zhǔn)的 R 的平方為 1 這個結(jié)論。因此,推測多元線性回歸將可以 100% 正確的預(yù)測幸福值。

      ####### Predicting the Test set results

      y_pred_lm = predict(regressor_lm, newdata = test_set)


      Pred_Actual_lm <- as.data.frame(cbind(Prediction = y_pred_lm, Actual = test_set$Happiness.Score))


      gg.lm <- ggplot(Pred_Actual_lm, aes(Actual, Prediction )) +

        geom_point() + theme_bw() + geom_abline() +

        labs(title = "Multiple Linear Regression", x = "Actual happiness score",

             y = "Predicted happiness score") +

        theme(plot.title = element_text(family = "Helvetica", face = "bold", size = (15)), 

              axis.title = element_text(family = "Helvetica", size = (10)))

      gg.lm

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)


      和預(yù)期一樣,實際和預(yù)測的對比圖證明了模型的準(zhǔn)確性。

      SVR(支持向量回歸)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      # Fitting SVR to the dataset

      library(e1071)

      regressor_svr = svm(formula = Happiness.Score ~ .,

                      data = dataset,

                      type = 'eps-regression',

                      kernel = 'radial')

      # Predicting a new result

      y_pred_svr = predict(regressor_svr,  newdata = test_set)


      Pred_Actual_svr <- as.data.frame(cbind(Prediction = y_pred_svr, Actual = test_set$Happiness.Score))



      Pred_Actual_lm.versus.svr <- cbind(Prediction.lm = y_pred_lm, Prediction.svr = y_pred_svr, Actual = test_set$Happiness.Score)



      gg.svr <- ggplot(Pred_Actual_svr, aes(Actual, Prediction )) +

        geom_point() + theme_bw() + geom_abline() +

        labs(title = "SVR", x = "Actual happiness score",

             y = "Predicted happiness score") +

        theme(plot.title = element_text(family = "Helvetica", face = "bold", size = (15)), 

              axis.title = element_text(family = "Helvetica", size = (10)))

      gg.svr

      通過支持向量回歸預(yù)測出的幸福值具有相當(dāng)高的準(zhǔn)確性。

      決策樹回歸

      # Fitting Decision Tree Regression to the dataset

      library(rpart)

      regressor_dt = rpart(formula = Happiness.Score ~ .,

                        data = dataset,

                        control = rpart.control(minsplit = 10))

      # Predicting a new result with Decision Tree Regression

      y_pred_dt = predict(regressor_dt, newdata = test_set)


      Pred_Actual_dt <- as.data.frame(cbind(Prediction = y_pred_dt, Actual = test_set$Happiness.Score))



      gg.dt <- ggplot(Pred_Actual_dt, aes(Actual, Prediction )) +

        geom_point() + theme_bw() + geom_abline() +

        labs(title = "Decision Tree Regression", x = "Actual happiness score",

             y = "Predicted happiness score") +

        theme(plot.title = element_text(family = "Helvetica", face = "bold", size = (15)), 

              axis.title = element_text(family = "Helvetica", size = (10)))

      gg.dt

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      看起來對于這個數(shù)集,決策樹回歸不是一個好的選擇。我們來看看這個樹狀圖。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      # Plotting the tree

      library(rpart.plot)

      prp(regressor_dt)

      隨機森林回歸

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      # Fitting Random Forest Regression to the dataset

      library(randomForest)

      set.seed(1234)

      regressor_rf = randomForest(x = dataset[-1],

                               y = dataset$Happiness.Score,

                               ntree = 500)

      # Predicting a new result with Random Forest Regression

      y_pred_rf = predict(regressor_rf, newdata = test_set)


      Pred_Actual_rf <- as.data.frame(cbind(Prediction = y_pred_rf, Actual = test_set$Happiness.Score))



      gg.rf <- ggplot(Pred_Actual_rf, aes(Actual, Prediction )) +

        geom_point() + theme_bw() + geom_abline() +

        labs(title = "Random Forest Regression", x = "Actual happiness score",

             y = "Predicted happiness score") +

        theme(plot.title = element_text(family = "Helvetica", face = "bold", size = (15)), 

              axis.title = element_text(family = "Helvetica", size = (10)))

      gg.rf

      隨機森林回歸得到的結(jié)果沒有支持向量回歸那么好,不過比決策樹回歸要好的多。

      神經(jīng)網(wǎng)絡(luò)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      # Fitting Neural Net to the training set

      library(neuralnet)


      nn <- neuralnet(Happiness.Score ~ Economy + Family + Life.Expectancy + Freedom + Generosity + Trust + Dystopia.Residual,

                      data=training_set,hidden=10,linear.output=TRUE)

      plot(nn)

      predicted.nn.values <- compute(nn,test_set[,2:8])


      Pred_Actual_nn <- as.data.frame(cbind(Prediction = predicted.nn.values$net.result, Actual = test_set$Happiness.Score))


      gg.nn <- ggplot(Pred_Actual_nn, aes(Actual, V1 )) +

        geom_point() + theme_bw() + geom_abline() +

        labs(title = "Neural Net", x = "Actual happiness score",

             y = "Predicted happiness score") +

        theme(plot.title = element_text(family = "Helvetica", face = "bold", size = (15)), 

            axis.title = element_text(family = "Helvetica", size = (10)))

      gg.nn

      神經(jīng)網(wǎng)絡(luò)是僅次于多元線性回歸的最好的預(yù)測算法。事實上,通過神經(jīng)網(wǎng)絡(luò)模型預(yù)測幸福值,其準(zhǔn)確性接近 100% 。我們對多元線性回歸和神經(jīng)網(wǎng)絡(luò)模型計算均方差。

      MSE.lm <- sum((test_set$Happiness.Score - y_pred_lm)^2)/nrow(test_set)

      MSE.nn <- sum((test_set$Happiness.Score - predicted.nn.values$net.result)^2)/nrow(test_set)


      print(paste("Mean Squared Error (Multiple Linear Regression):", MSE.lm))

      ## [1] "Mean Squared Error (Multiple Linear Regression): 0.0000000912868493258188"

      print(paste("Mean Squared Error (Neural Net):", MSE.nn))

      ## [1] "Mean Squared Error (Neural Net): 0.00146160413611951"

      和預(yù)期一樣,多元線性回歸的均方差要小于神經(jīng)網(wǎng)絡(luò)模型的。

      實際結(jié)果與通過不同的機器學(xué)習(xí)算法的預(yù)測結(jié)果作對比

      再次觀察預(yù)測出的結(jié)果,來直觀感受預(yù)測結(jié)果的準(zhǔn)確性。


      ggarrange(gg.lm, gg.svr, gg.dt, gg.rf, gg.nn, ncol = 2, nrow = 3)

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      多元線性回歸和神經(jīng)網(wǎng)絡(luò)的結(jié)果最好,預(yù)測也幾乎相同。支持向量回歸和隨機森林回歸的預(yù)測準(zhǔn)確性占據(jù)第二位。最后是決策樹算法,對預(yù)測幸福值這項工作它的結(jié)果是最糟糕的。

      雷鋒網(wǎng)字幕組編譯。

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      雷峰網(wǎng)版權(quán)文章,未經(jīng)授權(quán)禁止轉(zhuǎn)載。詳情見轉(zhuǎn)載須知

      用 Kaggle 數(shù)據(jù)集來分析幸福指數(shù)

      分享:
      相關(guān)文章

      知情人士

      AI研習(xí)社(yanxishe.com)譯站頻道,傳播前沿人工智能知識,讓語言不再成為學(xué)習(xí)知識的門檻。(原雷鋒字幕組)
      當(dāng)月熱門文章
      最新文章
      請?zhí)顚懮暾埲速Y料
      姓名
      電話
      郵箱
      微信號
      作品鏈接
      個人簡介
      為了您的賬戶安全,請驗證郵箱
      您的郵箱還未驗證,完成可獲20積分喲!
      請驗證您的郵箱
      立即驗證
      完善賬號信息
      您的賬號已經(jīng)綁定,現(xiàn)在您可以設(shè)置密碼以方便用郵箱登錄
      立即設(shè)置 以后再說
      主站蜘蛛池模板: 无码va在线观看| xxxx免费| 1024视频在线| 97超碰中文字幕| 熟女中文网站| 国产精品人妻在线观看| 国产乱色国产精品播放视频| 五月天国产成人av免费观看 | 亚洲精品国产字幕久久不卡 | 丰满老熟女毛片| 中文有无人妻vs无码人妻激烈| 蜜桃av抽搐高潮一区二区| 亚洲精品另类| 多人伦交性欧美| 国产精品无码作爱| 国产无套内射普通话对白| 一区二区三区无效卡| 中文字幕天天躁日日躁狠狠躁免费| 少妇人妻无码专区视频| 无码一卡二卡| 国产tsAV| 性按摩xxxx在线观看| 免费无码中文字幕A级毛片| 亚洲sm另类一区二区三区| 久草精品视频| 日本一区二区久久精品亚洲中文无| 日韩美女乱淫试看视频多人| 亚洲精品国产美女久久久99| 日本无遮挡吸乳呻吟视频| 免费视频爱爱太爽了激情| 精品伊人久久久香线蕉| 精品va在线观看| 久久精品国产只有精品96| 精品国产av色欲果冻传媒| 又粗又大又硬又长又爽| 成人午夜精品无码区久久| 少妇粉嫩小泬喷水视频| 四虎网址| 曰批永久免费40分钟免费观看软件| 久久精品国产99国产精偷| jizzjizz欧美|