Flowchart of a basic machine learning model
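Depending on the feedback available, machine learning tasks fall into three categories:
- Supervised learning: models built from labeled examples, that is, inputs paired with the outputs a human has supplied.
- Unsupervised learning: models that receive only inputs. No labels are given to the learning algorithm, so the model must find structure in the data on its own.
- Reinforcement learning: no labels are given to the learning algorithm here either. Instead, the algorithm learns from the rewards and penalties it receives.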
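The difference between the first two categories can be sketched with scikit-learn. The example below is a minimal illustration, not a recommended workflow: the two-blob toy dataset, the random seed, and the choice of `LogisticRegression` and `KMeans` are all assumptions made for the demonstration. Reinforcement learning is left out because scikit-learn does not provide estimators for it.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Toy data (made up for illustration): two well-separated blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=5.0, scale=0.5, size=(50, 2)),
])
y = np.array([0] * 50 + [1] * 50)  # labels, used only by the supervised model

# Supervised: the estimator is given both the inputs X and the labels y.
clf = LogisticRegression().fit(X, y)
supervised_pred = clf.predict(X)

# Unsupervised: the estimator is given only X and must find the groups itself.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
cluster_ids = km.labels_

print("supervised training accuracy:", clf.score(X, y))
print("cluster sizes:", np.bincount(cluster_ids))
```

Note that the cluster ids returned by `KMeans` are arbitrary: cluster 0 need not correspond to label 0, because the algorithm never saw the labels.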
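The algorithms available for each category are:

| Algorithm | Supervised | Unsupervised | Reinforcement |
|---|---|---|---|
| Linear | 1 | 0 | 0 |
| Logistic | 1 | 0 | 0 |
| K-Means | 1 | 1 | 0 |
| Anomaly Detection | 1 | 1 | 0 |
| Neural Net | 1 | 1 | 1 |
| KNN | 1 | 0 | 0 |
| Decision Tree | 1 | 0 | 0 |
| Random Forest | 1 | 0 | 0 |
| SVM | 1 | 0 | 0 |
| Naive Bayes | 1 | 0 | 0 |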
The following table lists the scikit-learn functions for each algorithm and what each is typically used for.

| Category | Algorithm | Function | Use |
|---|---|---|---|
| Basic Regression | Linear | `linear_model.LinearRegression()` | Lots of numerical data |
| | Logistic | `linear_model.LogisticRegression()` | Target variable is categorical |
| Cluster Analysis | K-Means | `cluster.KMeans()` | Groups similar data points around centroids |
| | Anomaly Detection | `covariance.EllipticEnvelope()` | Finding outliers through grouping |
| Classification | Neural Net | `neural_network.MLPClassifier()` | Complex relationships. Prone to overfitting. |
| | K-NN | `neighbors.KNeighborsClassifier()` | Group membership based on proximity |
| | Decision Tree | `tree.DecisionTreeClassifier()` | If/then/else. Non-contiguous data. Can also be used for regression. |
| | Random Forest | `ensemble.RandomForestClassifier()` | Finds the best split randomly. Can also be used for regression. |
| | SVM | `svm.SVC()`, `svm.LinearSVC()` | Maximum margin classifier. A fundamental data science algorithm. |
| | Naive Bayes | `naive_bayes.GaussianNB()`, `naive_bayes.MultinomialNB()`, `naive_bayes.BernoulliNB()` | Updating knowledge step by step with new information |
| Feature Reduction | t-Distributed Stochastic Neighbor Embedding | `manifold.TSNE()` | Visualizing high-dimensional data. Converts similarities to joint probabilities. |
| | Principal Component Analysis | `decomposition.PCA()` | Distills the feature space into components that describe the greatest variance |
| | Canonical Correlation Analysis | `cross_decomposition.CCA()` | Making sense of cross-correlation matrices |
| | Linear Discriminant Analysis | `discriminant_analysis.LinearDiscriminantAnalysis()` | Linear combination of features that separates classes |
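To show how entries from the table combine in practice, here is a small sketch that chains a feature-reduction step (`decomposition.PCA()`) with a classifier (`neighbors.KNeighborsClassifier()`). The Iris dataset bundled with scikit-learn, the 75/25 split, the two principal components, and the five neighbors are illustrative assumptions, not settings taken from the text above.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Load a small labeled dataset that ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to estimate generalization.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Feature reduction (PCA) followed by classification (K-NN).
model = make_pipeline(PCA(n_components=2), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

Wrapping both steps in a pipeline means PCA is fitted on the training split only, so no information from the held-out samples leaks into the reduction step. Any other classifier from the table could be swapped in for `KNeighborsClassifier` without changing the rest of the code.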
The flowchart below is a rough guide to choosing an estimator: it helps you narrow down the task at hand and the ML techniques that can solve it.