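CART 6.0 ProEX Data Mining and Analysis Software
CART is the flagship data mining software from Salford Systems. It is a powerful, easy-to-use decision tree tool that automatically sifts through complex data.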
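The ultimate classification tree:
The CART® modeling engine in Salford Predictive Modeler is the ultimate classification tree. It has revolutionized the field of advanced analytics and inaugurated the current era of data science. CART is one of the most important tools in modern data mining.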
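Proprietary code:
Technically, the CART modeling engine is based on landmark mathematical theory introduced in 1984 by four world-renowned statisticians at Stanford University and the University of California, Berkeley. The CART modeling engine is SPM's implementation of Classification and Regression Trees, and it is the only decision tree software that embodies the original proprietary code.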
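Fast and versatile:
Patented extensions to the CART modeling engine are designed specifically to enhance results for market research and web analytics. The CART modeling engine supports high-speed deployment, allowing Salford Predictive Modeler's models to predict and score in real time on a massive scale. Over the years, the CART modeling engine has become one of the most popular and easy-to-use predictive modeling algorithms available to analysts, and it also serves as the foundation for many modern data mining methods based on bagging and boosting.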
Classification and Regression Trees
CART® software is the ultimate classification tree that has
revolutionized the field of advanced analytics, and inaugurated the
current era of data science. CART is one of the most important
tools in modern data mining.
Features
Linear Combination Splits
Optimal tree selection based on area under ROC curve
User defined splits for the root node and its children
Translating models into Topology
Edit and modify the CART trees via FORCE command structures
RATIO of the improvements of the primary splitter and the
first competitor
Scoring of CV models as an Ensemble
Report impact of penalties in root node
New penalty against biased splits PENALTY BIAS (PENALTY /
BIAS, CONTBIAS, CATBIAS)
Automation: Generate models with alternative handling of
missing values (Automate MISSING_PENALTY)
Automation: Build a model using each splitting rule (six
for classification, two for regression) (Automate RULES)
Automation: Build a series of models varying the depth of
the tree (Automate DEPTH)
Automation: Build a series of models changing the minimum
required size on parent nodes (Automate ATOM)
Automation: Build a series of models changing the minimum
required size on child nodes (Automate MINCHILD)
Automation: Explore accuracy versus speed trade-off due to
potential sampling of records at each node in a tree
(Automate SUBSAMPLE)
Automation: Generate a series of N unsupervised-learning
models (Automate UNSUPERVISED)
Automation: Vary the RIN (Regression In the Node)
parameter through a series of values (Automate RIN)
Automation: Vary the number of "folds" used in
cross-validation (Automate CVFOLDS)
Automation: Repeat cross-validation process many times to
explore the variance of estimates (Automate CVREPEATED)
Automation: Build a series of models using a user-supplied
list of binning variables for cross-validation (Automate
CVBIN)
Automation: Check the validity of model performance using
Monte Carlo shuffling of the target (Automate
TARGETSHUFFLE)
Automation: Build two linked models, where the first one
predicts the binary event while the second one predicts the
amount (Automate RELATED). For example, predicting whether
someone will buy and how much they will spend
Automation: Indicates whether a variable importance matrix
report should be produced when possible (Automate VARIMP)
Automation: Saves the variable importance matrix to a
comma-separated file (Automate VARIMPFILE)
Automation: Generate models with alternative handling of
missing values (AUTOMATE MVI)
Hotspot detection for Automate UNSUPERVISED
Hotspot detection for Automate TARGET
Hotspot detection to identify the richest nodes across the
multiple trees
Differential Lift Modeling (Netlift/Uplift)
Profile tab in CART Summary window
Multiple user defined lists for linear combinations
Constrained trees
Ability to create and save dummy variables for every node
in the tree during scoring
Report basic stats on any variable of user choice at every
node in the tree
Comparison of learn vs. test performance at every node of
every tree in the sequence
Automation: Vary the priors for the specified class
(Automate PRIORS)
Automation: Build a series of models by progressively
removing misclassified records, thus increasing the
robustness of trees and possibly reducing model complexity
(Automate REFINE)
Automation: Bagging and ARCing using the legacy code
(COMBINE)
Automation: Build a series of models limiting the number of
nodes in a tree (Automate NODES)
Automation: Build a series of models trying each available
predictor as the root node splitter (Automate ROOT)
Automation: Explore the impact of favoring equal sized
child nodes by varying CART’s end cut parameter (Automate
POWER)
Automation: Explore the impact of penalty on categorical
predictors (Automate PENALTY=HLC)
Build a Random Forests model utilizing the CART engine to
gain alternative handling of missing values via surrogate
splits (Automate BOOTSTRAP RSPLIT)
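
The Monte Carlo target-shuffling check listed above (Automate TARGETSHUFFLE) can be illustrated outside the product. The following is a minimal pure-Python sketch of the general idea, not Salford's implementation or command syntax. It assumes a single numeric predictor, a binary 0/1 target, and the Gini splitting rule, and every function name in it is hypothetical. It measures the best impurity reduction a single split can achieve on the real target, then counts how often a randomly shuffled target does at least as well.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split_improvement(x, y):
    """Largest Gini impurity reduction over all thresholds on one predictor."""
    n = len(y)
    parent = gini(y)
    pairs = sorted(zip(x, y))
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal predictor values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best


def target_shuffle_p_value(x, y, n_shuffles=200, seed=0):
    """Compare the observed improvement with improvements on shuffled targets."""
    rng = random.Random(seed)
    observed = best_split_improvement(x, y)
    shuffled = list(y)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real link between x and the target
        if best_split_improvement(x, shuffled) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (at_least_as_good + 1) / (n_shuffles + 1)


# Toy data (made up for illustration): the target flips from 0 to 1 at x = 7.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
observed, p_value = target_shuffle_p_value(x, y)
print("observed improvement:", observed)
print("shuffle p-value:", p_value)
```

A small p-value suggests that the split quality found on the real target is unlikely to arise by chance. A large one would indicate that the apparent performance may be an artifact of searching over many possible splits.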