I have been working on the CMAT software since 1995, when my wife
went to New Orleans for work and I felt lonely in Raleigh, NC.
Here is a short summary of the history, content, and features of CMAT.
Since September 28, 2016, a new version of CMAT has been ready
for download from the web. It is probably the most bug-free
version of the last five releases. There may still be a
few minor things I could fix, perhaps at the end of 2016.
Because of another, more urgent project, I must take a break from coding CMAT
starting October 2016. For a 73-year-old it will not be easy to
resume the CMAT coding in the summer of 2017.
There are five new papers available describing some outlier analysis
of the voting behavior during the
 March 13, 2016 Landtagswahlen (state elections) in Rheinland-Pfalz
 March 13, 2016 Landtagswahlen in Baden-Wuerttemberg
 March 13, 2016 Landtagswahlen in Sachsen-Anhalt
 September 4, 2016 Landtagswahlen in Mecklenburg-Vorpommern
 September 18, 2016 Wahlen zum Abgeordnetenhaus (state parliament elections) in Berlin:
According to the data there are 1662476 voters, of whose votes 1635169 are valid
and 25694 invalid, which gives a total of 1660863 votes. The remaining
1613 voters must have lost their vote before they were able to cast it (either valid
or invalid), or some larger group of voters must have voted with less than
one vote. The largest difference between the count of voters and the count of
valid plus invalid votes is 33 and occurs at the postal voting
district 1488 = Neukoelln 3D W.
 March 26, 2017 Landtagswahlen Saarland: The data are available only at the level
of the city, i.e. even for the capital Saarbruecken there is only one row in
the data matrix although there are many hundreds of voting stations. That form of data
averages out most extreme values obtained from the separate voting locations.
The Statistics Office in Saarland notified me that the final results at the level
of the voting stations would be available perhaps at the start of August.
Looks like all votes would have to be recounted?
 May 7, 2017 Landtagswahlen in Schleswig-Holstein
 May 14, 2017 Landtagswahlen Nordrhein-Westfalen: Data are not yet available
 ... Bundestagswahl (federal election) is coming :) ...
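The vote counts quoted above for the Berlin election can be checked with a few lines of arithmetic. The sketch below uses only the four figures given in the text; the variable names are chosen here for illustration.

```python
# Figures quoted in the text for the Berlin election data.
voters = 1_662_476
valid_votes = 1_635_169
invalid_votes = 25_694

# Every cast vote is either valid or invalid.
total_votes = valid_votes + invalid_votes

# Voters for whom neither a valid nor an invalid vote was counted.
unaccounted = voters - total_votes

print(total_votes)   # 1660863
print(unaccounted)   # 1613
```

Both results agree with the totals stated in the text.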
Unfortunately, due to specific demand, these papers are all written in German.
The methods of uncovering fraud are based on the assumption that it is
recognizable as an outlier; they will not be successful wherever fraud is
a general feature.
Today, June 10, 2017, some new files were uploaded showing the MCD
(minimum covariance determinant) results for the election district data
(Wahlbezirksdaten) of elections with fewer than 3000 districts (Wahlbezirke). Most of the
results of that new large-scale MCD analysis just confirm the results
already found with the methods used for the older papers.
"Do you know, comrades," said Stalin, "what I think about this question?
I believe it is completely unimportant who in the party will vote,
and how; only one thing is extremely important, namely who
counts the votes, and how."
Even Kamenev, who must have known Stalin by then, cleared his throat audibly.
in: Boris Bazhanow: "Ich war Stalins Sekretaer", Ullstein, 1977, page 68.
"It must look democratic, but we must have everything in our hands."
Walter Ulbricht in May 1945,
in Mario Frank: "Walter Ulbricht: Eine deutsche Biographie",
Siedler Verlag
 Structural Equation Modeling
Some Guidelines
 with our automatic modeling function we are creating excellent
CFA (confirmatory factor analysis) models for your data;
(at no charge if we are not able to find models with p larger than 0.01)
 structural equation modeling (SEM) for metric and ordinal data
(multiple sample analysis)
CFA model improvement algorithm
 Linear and Nonlinear Optimization:
using CMAT and Matlab (Optimization Toolbox)
 Linear and Nonlinear Statistics:
 dimension reduction and variable selection
 normalization and analysis of microarray data
(using Bioconductor)
 structural equation modeling (SEM), IRT, factor analysis, and PLS
 Data Mining with CMAT, R, and SAS Enterprise Miner (v. 9.1.3)
 Programming in SAS: DATA Step, IML, STAT, ETS, OR, and EM
(using SAS software version 9.1.3)
 Programming CMAT, Matlab, R, C, and Fortran
 Release 1: 1996 (Copyright February 1997)
 Release 2: December 1999
 Release 3: December 2002
 Release 4: July 2007
 Release 5: January 2009
 Release 6: November 2011:
I'm now developing with MS Visual Studio 2010 (C/C++) and
Intel Parallel Studio (Fortran 90)
 Release 7: December 2013
 Release 8: December 2015
 Release 9: September 2016
I'm still looking for some people who want to work with me on this,
especially:
 Somebody who does some testing of the language and functions.
Enters bug reports. Would need to have some Math and Stat background.
 Somebody who does some marketing. Would need to know about competing
software in Math and Stat (Octave, Matlab, and SAS IML).
Wolfgang defines WCU as the ratio between the number of people who actually
use a software (i.e. would even be willing to pay for its use) and the
number of people who developed that software.
Now, since CMAT has only one developer, the smallest possible values of WCU
are 0 and 1, and compared to the high values of WCU for SPSS and SAS
(not even thinking of Google and Facebook) CMAT seems to do rather badly.
However, quite a number of R packages don't do much better than CMAT :)
Please read this license carefully before using the software.
By installing or using this software, you are agreeing to be
bound by the terms of this license. If you do not agree to
the terms of this license, either contact the author or
promptly remove the software.
 This software may not be distributed to third parties.
The free version may only be used for nonprofit research and teaching.
 Using the software for commercial applications or for profit
needs a specific license agreement.
 Supply of any part of this software as part of another
software requires the separate prior written agreement of the
CMAT managers, which may include financial terms.
 This software is copyrighted and may not be modified,
decompiled, reverse engineered, or disassembled.
 Due acknowledgment shall be made of the use of CMAT
in research reports or publications.
"everything free comes without guarantee".
Patience is expected. The author is grateful for responses by users such as bug
reports or proposals for improvement.
At this time there is only a Windows version. A Linux version would be more
appropriate and will be out shortly.
The software and manual of CMAT are offered “AS IS” and without warranties as to
performance or merchantability. The sellers and/or redistributors may have made
statements about this software. Any such statements do not constitute warranties
and shall not be relied on by the user in deciding whether to use this program.
This program is offered without any express or implied warranties whatsoever.
Because of the diversity of conditions and hardware under which this program may
be used, no warranty of fitness for a particular purpose is offered.
The users of this software are advised to test the program thoroughly
before relying on it. The users must assume the entire risk of using the program;
any liability of seller, provider or manufacturer will be limited exclusively
to product replacement.
In no event shall the maker or distributor of CMAT be liable for any loss of profit
or any other commercial damage, including but not limited to special, incidental,
consequential or other damages in the use, installation and
application of CMAT.
There are two ways of installing CMAT: either you copy the
most important files from the download site or copy the complete
directory structure from a DVD distributed by the developer.
The files from this website represent the most recent but not
completely tested version whereas the files on DVD correspond
to an earlier, but in general more stable release.
The download files are zipped (using the 7z software with the -tzip
option so they can be unzipped with common ZIP software) but
need a password for unzipping, which can be obtained from the author
on request (email, LinkedIn).
The following directory structure is highly recommended:
 Download and unzip the file (about 6 MB):
cmat_util.zip
creating the directory C:/cmat_util containing the
directory gnuplot and the files for using 7z which is used
by the CMAT function system().
 Create directory C:/cmat.
 Download into cmat and unzip the file (about 16 MB)
cmat_com.zip
creating the subdirectory cmat/com
which contains the executable cmat.exe,
a number of DLLs, and the message files for the output.
 Download into cmat and unzip the file (about 62 MB)
cmat_test.zip
creating the following subdirectories:
 cmat/test with general test examples
 cmat/tgplt with test examples for plotting
 cmat/tmicro with test examples for micro array data
 cmat/tnlp with test examples for optimization (LP and NLP)
 cmat/tsem with test examples for structural equation
modeling, factor analysis, rotation, and IRT modeling
 Download into cmat and unzip the file (about 24 MB)
cmat_data.zip
creating the directory cmat/data which contains
the following subdirectories:
 mytst is almost empty and is a good
place for the user's own work
 tdata contains a large number of data sets, many of
them used by the test examples. All files have the extension .dat
 Download into cmat and unzip the file (about 430 MB!)
cmat_save.zip
creating the directory cmat/save, which contains a number of .dob files
generated using the obj2fil() function which can be read by
the fil2obj() function. These files are used by some
applications with very large data sets. This directory is only
needed if you use the functions obj2fil() and fil2obj()
in your CMAT input.
 Download into cmat and unzip the file (about 14 MB)
cmat_doc.zip
creating the directory cmat/doc which contains a number
of pdf files documenting CMAT, including all Newsletters
(see also below). The files for this directory can also be
accessed directly on the wcmat website.
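After unzipping, the recommended layout can be verified with a short script. The following is a sketch only: the subdirectory names are taken from the steps above (with mytst and tdata placed under data, as those steps describe), while the function names `missing_dirs` and `create_layout` are made up for this example and are not part of CMAT.

```python
from pathlib import Path

# Subdirectories named in the installation steps, relative to the cmat root.
CMAT_SUBDIRS = [
    "com", "test", "tgplt", "tmicro", "tnlp", "tsem",
    "data/mytst", "data/tdata", "save", "doc",
]

def missing_dirs(root):
    """Return the expected subdirectories that do not exist under root."""
    root = Path(root)
    return [d for d in CMAT_SUBDIRS if not (root / d).is_dir()]

def create_layout(root):
    """Create the empty directory skeleton (unzipping normally does this)."""
    for d in CMAT_SUBDIRS:
        (Path(root) / d).mkdir(parents=True, exist_ok=True)
```

Calling `missing_dirs("C:/cmat")` after the downloads lists any directory that was not created, for example `save` if the optional 430 MB file was skipped.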
When, e.g., Emacs is installed, CMAT can be used in command line or
batch mode in Windows, preferably in C:/cmat/mytst.
Adobe Reader is needed for online documentation only when the Tcl/Tk
graphical interface is used. It must then be installed
in a directory Reader (containing the files AcroRd32.exe, AcroRd32.dll etc.),
preferably at the same hierarchy level as the cmat directory.
Then the help("string") function can be used to open the reference
manual at the specified term. You may move easily between the more than
400 terms (bookmarks) of the CMAT Reference Manual by clicking the
many hyperlinks. The manual can also be opened at a specific page
or to search for a specified term.
The Tcl/Tk GUI can also be used for accessing the reference manual.
For graphical output (plotting) CMAT has an interface to the gnuplot
software. In CMAT you may connect to gnuplot either interactively or
in batch mode (running scripts).
Most Linux distributions have gnuplot included. For Windows and other OS
gnuplot can be downloaded from the internet free of charge, see for example
here
(There is also a demo gallery.) For some "terminal" output of
gnuplot (like SVG, EPSLATEX, and PDF) you may have to download
some additional software (SVGViewer, graphixs, etc).
An excellent book about gnuplot is:
Philip K. Janert, "Gnuplot in Action",
Greenwich CT: Manning Publications Co., 2009
Toshihiko Kawano has his "not so frequently asked questions"
here
and here something about
gnuplot tricks.
CMAT is a scripting language like Matlab or R or SAS/IML.
CMAT can be run either in batch mode or interactively with
command line input. In MS Windows this can be done either
 with one of the two available graphical interfaces,
one in Windows and one in Unix style,
 or by calling cmat.exe in a DOS Command window,
or using Emacs (and possibly BASH) in Unix style.
See remarks in the CMAT Tutorial document.
It is best to run CMAT (either interactively or in Batch mode)
from the mytst, test, tnlp, or tsem directory.
The test, tnlp, and tsem directories contain a large number
of batch example files all ending with the extension .inp
together with the .log and .txt output files. When running
those examples, e.g.
cmat tode.inp
you should obtain the same results, but the log file will show
the actual date of the execution. Note that the CMAT input must always
start with a { brace opening a compound statement.
At least one statement (possibly an empty one) must be run before the
script is closed by a matching } brace.
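The outer-brace rule can be checked before a script is handed to cmat. The sketch below is a hypothetical helper, not part of CMAT: it only looks at the first and last non-blank characters and knows nothing about CMAT comments or statements.

```python
def is_wrapped_in_braces(script_text):
    """Check that the first and last non-blank characters are { and }."""
    stripped = script_text.strip()
    return (len(stripped) >= 2
            and stripped.startswith("{")
            and stripped.endswith("}"))
```

For example, `is_wrapped_in_braces("{ ... }")` is true, while a text without the opening or the closing brace is rejected.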
You will find more information about downloading and running CMAT
in Windows, Linux, and Unix at the download site.
Bugs are usually fixed when they are found. However, patience is expected:
If you are aware of any problems with CMAT please contact the developer.
 There are some problems with the import() and export()
functions. They will probably be fixed during the next few months.
 There is a problem with the con (LPasL1) version of
the lp() function. The pcx and clp versions
can be used instead.
The following are .PDF files which are available for download
at the download site. Only the Tutorial and the
Summary Manual can be downloaded from here.
Use the right mouse button for downloading files.

Complete User's Manual (about 18 MB, close to 3000 pages):
 User Software License Agreement
 Introduction
 Installing and Running CMAT
 Restrictions and Comparisons
 Tutorial: Basic Elements of the CMAT Language
 Summary of Operators, Keywords, and Functions
 Reference Guide
 Some Details
 The Bibliography

CMAT Reference Manual (about 14 MB, more than 2000 pages):
 Reference Guide
 The Bibliography

CMAT Tutorial (about 200 pages):
 Introduction
 Tutorial: Basic Elements of the CMAT Language
 Summary of Operators, Keywords, and Functions
 The Bibliography
 CMAT Details and Examples:
 Details
 The Bibliography

CMAT Summary Manual (about 110 pages):
 Introduction
 Summary of Operators, Keywords, and Functions
 The Bibliography
Here are some
short guidelines
on how to use the hyperref package in LaTeX.
Please note that the developments reported in the most recently posted newsletter
are not necessarily implemented in the most recently posted software version on the net, or
may not be much tested and may not work with the software posted at the site.
 Scalars: (long) int, (double) real, (double) complex, string
 Vectors (dense and sparse) for all data types, even mixed
 Matrices (dense and sparse) for all data types, even mixed
and for some specific matrix types (diagonal, band, symmetric,
triangular)
 Tensors (dense and sparse) for all data types, even mixed
 Lists, where each entry can be scalar, vector, matrix, tensor,
list, or struct; entries are referred to by index
 Structures (since end of 2015), where each entry can be scalar,
vector, matrix, tensor, list, or struct; entries are referred
to by compound name
 KD Trees (not completely finished, internally only)
 An Important Language Extension:
This two-page paper describes a new form of
matrix literal
permitting the input of matrices containing string data without quotes.
(Note, for using that unquoted form of string input, matrix literals
may not contain white space inside.)
 Data Objects in CMAT:
This paper sketches some aspects of the
data objects
implemented in CMAT.
 Tensor and List Operations in CMAT:
Many matrix operations have been extended to tensors. However, this
paper sketches also some more specific and additional operations for
tensors and data lists.
The download site also contains a number of technical reports
illustrating applications of CMAT:
 Semiannual Newsletters Starting 2003
The
CMAT Newsletters
report about the progress in the development and illustrate
some applications of CMAT.
 On the Use of Matrix Language:
This small
paper
illustrates the difference between educational and efficient programming.
Note, CMAT almost always knows when a matrix is symmetric and takes
advantage of this. Also, identity matrices are stored as diagonal matrices.
Sparsity in matrices and vectors is detected automatically.
Such examples are often found in statistics.
 CMAT Code for some Matlab Programs by
Olvi Mangasarian and Helen Zhang.
 Presentation at DAGStat Conference, Bielefeld, March 2007:
 Variable Selection Algorithm for Micro Array Data:
The analysis of gene expression data is currently a very challenging task.
However this paper shows that we can use CMAT to find a very small number
of genes from 22283 genes of an Affymetrix chip which yields an
exact classification of two kinds of cancer.
 On the new
CFA model improvement algorithm
 On the difference of p values for
exact logistic regression
computed by SAS PROC LOGISTIC, elrm in R, and CMAT.
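The remark above that CMAT recognizes symmetric and sparse matrices can be made concrete with a small sketch. How CMAT does this internally is not described here, so the following Python code is only an illustration of the general idea; all function names are made up for this example.

```python
def is_symmetric(matrix, tol=0.0):
    """True if the matrix is square and equal to its transpose within tol."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    return all(abs(matrix[i][j] - matrix[j][i]) <= tol
               for i in range(n) for j in range(i + 1, n))

def density(matrix):
    """Fraction of nonzero entries; small values suggest sparse storage."""
    total = sum(len(row) for row in matrix)
    nonzero = sum(1 for row in matrix for x in row if x != 0)
    return nonzero / total if total else 0.0

def pack_lower(matrix):
    """Store only the lower triangle of a symmetric matrix, row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i + 1)]

# A 3 x 3 identity matrix: symmetric, and only the diagonal is nonzero.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
packed = pack_lower(identity)
```

For an n x n symmetric matrix the packed form holds n(n+1)/2 values instead of n*n, which is where the storage and computing advantage comes from.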
CMAT was first written for Unix and later for Windows.
 Versions for Mac, Linux, and Unix (should be easy since C code
is portable, and lex and yacc are native in Unix)
 Dynamic binding of C and Fortran users code.
 Extending preprocessor commands.
 Newsletter January 2003
 Faster matrix concatenation
 Reading and Writing of Matlab version 5 .mat Files
 ISOREG(): Isotone Regression (PAVA: Pool Adjacent Violators Algorithm,
Optimal Scaling)
 VARSEL(): Single and multiple response variable selection:
Forward, backward, and stepwise selection.
All subset combinations or randomly generated samples.
 Newsletter March 2003
 SIGN2(a,b) and SIGN4(a1,a2,a3,b) signum functions
 CANCOR(): Canonical Correlation Analysis
 MBURG(): Modified Burg algorithm for one- and two-dimensional time series
 Newsletter May 2003
 GLIM(): fixed some bugs and added some observation-wise statistics and ROC curve
 GLMIXD(): fixed lots of bugs, added type 1 and type 3 estimates,
and added some observation-wise statistics and ROC curve
 CDF23(): 2 and 3 dimensional quadrature of normal and t distribution
 PROMEP(): experimental design
 NOHARM(): factor analysis for dichotomous (0,1) data with robust
(non-normal) GOF and ASEs
 FACTOR(): exploratory factor analysis with robust (non-normal) GOF and ASEs
 Newsletter July 2003
 SEM(): robust asymptotic standard errors and robust Satorra-Bentler chi-square
Jackknife for identifying model outliers ("misfits")
 SVM(): automatic parameter tuning and two new methods (NSVM and FSM)
block, split, and random cross validation
Jackknife for identifying model outliers ("misfits")
 Newsletter September 2003
 POLYCHOR(): polychoric correlation matrices and their covariance matrix
 ODE(): two new algorithms
 TRI2VEC(): move triangular (lower or upper) part of matrix to vector
 GENEREAD(): reading microarray data
 Newsletter November 2003
 KSTEST(): new features for Kolmogorov-Smirnov test
 KSPROB(): compute probability of Kolmogorov CDF
 HOTELL(): compute classic and robust one-sample Hotelling's test
with confidence intervals
 PLS(): partial least squares (PLS) and principal components regression (PCR)
block, split, and random cross validation
 SVDTRIP(): compute singular triplets (U,D,V) for large and/or sparse matrices
 Newsletter January 2004
 SIR(): sliced inverse regression for dimension reduction
(including principal Hessian directions)
 Newsletter June 2004
 PLS(): randomization test for number of components added
 SVM(): new feature selection methods added
 GAROTTE(): feature selection algorithm by Breiman (1993)
 LARS(): group of feature selection methods including the following:
LARS: Least Angle Regression (Efron, Hastie, Johnstone & Tibshirani, 2002)
Lasso: Tibshirani (1996), Osborne, Presnell & Turlach (2000)
Forward Stagewise: (Efron, Hastie, Johnstone & Tibshirani, 2002)
Ridge Regression
Elastic Net (Zou & Hastie, 2003)
Univariate Soft Thresholding (Donoho et al., 1995)
 NLKPCA(): nonlinear kernel PCA (Schoelkopf; Rosipal & Trejo, 2001)
 NLKPLS(): nonlinear kernel PLS (Bennett & Embrechts, 2003)
 Newsletter December 2004
 REG(): Jackknifing for outlier (misfit) detection added
 FROTATE(): many new rotation methods added (Bernaards & Jennrich, 2004)
 LRFORW(): linear LS forward selection method for very many variables
(no need for storing the large X'X matrix)
 RANDISC(): some generators for discrete random variates
(Marsaglia, Tsang, & Wang, 2004)
 SCREETST(): methods for testing the significant number of
eigen values of X'X or covariance matrices
or singular values of rectangular matrices
 SCALPHA(): computes the sample coefficient by Cronbach (1951)
with asymptotic standard error and confidence interval
 SPLIT(): simple CART algorithm for binary response
with chi-square split criterion
 VARCLUS(): variable cluster algorithm for very many variables
(similar to SAS PROC VARCLUS, but without the need to store
the large X'X or covariance matrix)
 Newsletter July 2005
 PCA(): implements eight different algorithms of principal
component analysis including asymptotic standard errors
and confidence intervals for unrotated and rotated
component loadings (normal theory analytic and bootstrap)
 FACTOR(): asymptotic standard errors and confidence intervals
for orthogonal and obliquely rotated factor solutions were added
(normal theory analytic and bootstrap)
 NOHARM(): entire suite of rotation algorithms are added;
asymptotic standard errors and confidence intervals
for orthogonal and obliquely rotated factor solutions were added
(normal theory analytic and bootstrap)
 CENTROID(): implements methods for classical centroid decomposition
 GENEREAD(): for comma (or otherwise) separated data set input
 HISTOGRM(): implements computation of histogram frequencies
 IMPUTE(): implements various methods for missing value imputation
 NNMF(): implements algorithm for nonnegative matrix factorization
 PERMUTE(): obtains permutations and combinations (all or stepwise)
 QUANTILE(): compute quantiles
 SIMDID(): compute some (not so great) similarity and distance measures
 SVDUPD(): rank-k update of the SVD of a matrix
 Newsletter December 2005
 GLIM(): multinomial Logit model is added
 ANACOR(): correspondence analysis of frequency tables (Gifi, 1990)
 ANAPROF(): correspondence analysis of profile data (Gifi, 1990)
 PRINCALS(): principal component analysis for categorical data (Gifi, 1990)
 CNDCOV(): conditional covariance matrices (time series)
 Newsletter July 2006
 CANALS(): canonical correlation analysis of two sets
of variables (Gifi, 1990)
 CUCLCR(): cubic cluster criterion and R2 for
given cluster decomposition (Sarle, 1983)
 DEMREG(): univariate Deming regression
 DIXONR(): pdf, cdf and critical values for Dixon's r
 FICA(): (Fast) Independent Component Analysis
 HISTPLOT(): plotting a histogram
 HOMALS(): homogeneity analysis of categorical data (Gifi, 1990)
 ITA(): classical and inductive item tree analysis (Schrepp, 2006)
 OVERALS(): canonical correlation analysis of more than
two sets of variables (Gifi, 1990)
 PRIMALS(): one-dimensional homogeneity analysis of
categorical data (Gifi, 1990)
 SDD(): Semi-Discrete Decomposition (Kolda & O'Leary, 1999)
 SGMANOVA(): multivariate analysis of variance based on
spatial signs (robust estimation method)
 XYPLOT(): plotting an XY diagram
 ZOVERW(): probability and density of the normally distributed
ratio z/w of normally distributed variables z and w (Marsaglia, 2006)
 Newsletter December 2006
 BYTE(): converts integers into characters using the ASCII table
 BRANKS(): computes tied and bivariate ranks
 COVLAG(): computes autocovariance estimates for a vector time series
 GEE(): generalized estimating equations (Liang and Zeger)
 RANKTIE(): averaging tie ranking of entries of a vector
 ROCCOMP(): test the equality of areas under ROC curve (DeLong et al., 1988)
 SCAD(): Smoothly Clipped Absolute Deviations (Fan & Li, 2002)
for LS regression, PH regression, and SVM
 SMSVM(): (structured) multicategory SVM (Lee, Lin, and Wahba, 2003)
 TOEPLITZ(): generates (block) Toeplitz matrix
 Newsletter July 2007
 Language extension for multidimensional arrays and
implementing tensor operations
 Language extension for lists of data objects
 CONST(): multidimensional extension of CONS()
 COVSHRK(): shrinking the covariance matrix for outliers in data
 DIM(): returning sizes of dimension of data objects (vectors, matrices, tensors)
 DIMLABEL(): assigning or pulling labels from tensor dimensions
 DIMNAME(): assigning or pulling names from tensor dimensions
 LOC(): returning index locations of specific data entries
 MAT2TEN(): create tensor from list of matrices
 RANDT(): multidimensional extension of RAND()
 TEN2MAT(): create list of matrices from tensor
 TEN2VEC(): move entries of tensor into data vector
 VEC2TEN(): move entries of data vector into tensor
 TENPERM(): reorder (permute) tensor dimensions
 TENTVEC(): multiply tensor with vector or list of vectors
 TENTMAT(): multiply tensor with matrix or list of matrices
 TENTTEN(): multiply tensor with tensor
 Extension to MAX() and MIN() functions
 Extension to SCAD function
 Extension to SMSVM() function (Lee, Lin, and Wahba, 2003)
 Extension to SVM() function
 Two types of index processing in matrices
 Fixing bugs for new release 4
 Newsletter December 2007
 DIM(): returns dimensionality of data object
 IRTML(): maximum likelihood IRT (item response theory)
 IRTMS(): Mokken scale IRT
 MDS(): multidimensional scaling and unfolding
 SETDIFF(): set difference of two data objects
 SETISECT(): set intersection of two data objects
 SETMEMBR(): returns binary membership of the entries of one object in another one
 SETUNION(): set union of two data objects
 SETXOR(): set XOR (eXclusive OR) of two data objects
 SIZE(): returns size in dimensions of data object
 SORTROW(): sorts rows of a data matrix w.r.t. a sorting key
 UNIQUE(): returns the unique entries of a data object
 VEC2TRI(): moves vector into compact triangular matrix
 Newsletter July 2008
 CODAPP(): applying the complete orthogonal decomposition
 HBADDTST(), HBANOVA(), HBBARTLETT(), HBCOVAR(), HBDISCRIM(),
HBLRG(), HBLTST(), HBRCMP()
 SELC():
 URD1OUT():
 Extension of MIN() and MAX() functions for more than two arguments
 Extension of IRTML(): input probability data and R1 measure
 Extensions for COD(), MDS(), and ODE() functions
 Extension of NLP() and NLE(): more return arguments
 Extension of NLP(): grid search
 Extension of NLP(): new conjugate gradient techniques: Birgin-Martinez
and scaled PR and FR
 Many test examples for NLP() in Part II
 Newsletter December 2008
 Extension of NLP(): UOBYQA and NEWUOA algorithms
 Extension of NLP(): Nonsmooth BT algorithms
 LOG2(): logarithm w.r.t. base 2
 ENCRYPT() and DECRYPT(): for encrypting files and directories
 LOCATN(): algorithms for the optimal location assignment problem
(greedy algorithm and Lagrangean relaxation)
 NLFIT(): data mining using stagewise nonlinear regression
using sets of activation and link functions
 NLFITPRD(): scoring a data set using the model from NLFIT()
 Some algorithms for normalizing microarray data
 Comparing the performance of random generators for normal distribution
 Application of the location assignment for matching the
performance of an index fund
 Testing CMAT for release 5 in January 2009
 Newsletter July 2009
 Fixed bugs in MRAND()
 Extension to the CMAT language:
Adding k-dimensional trees to the set of data objects:
 KDTCRT(): create k-dimensional tree from data matrix
 KDTNEA(): obtain nearest neighbor nodes of kD tree
 KDTRNG(): obtain nodes of kD tree inside ball with specified radius
 Extensions to AFFVSN():
 Extensions to GLIM(): Hosmer and Lemeshow Test
 Extensions to CLUSTER(): more and better returns, some plotting
 Extensions to TOEPLITZ(): Levinson, Trench, and Durbin algorithm added
for solving the Yule-Walker equations
 Extensions to UNIVAR(): many more location and scale measures
 AFFRMA(): Robust Multichip Average (Bolstad et al., 2003) (not finished yet)
 ARIMA(): is still in the works
 ARMCOV(): modified covariance method for linear time series prediction
 BURG(): compute moving average whitening filter using method by Burg (1968)
 LDP(): linear distance programming (Lawson and Hanson, 1995)
 LOESS(): multivariate robust locally weighted regression (Cleveland, Grosse, and Shyu, 1992)
 LOWES(): univariate robust locally weighted regression (Cleveland, 1979)
 MEMPSD(): compute power spectrum of autoregressive filter
 POLYFIT(): fitting the polynomial model
 POLYVAL(): evaluating the polynomial model
 PPPD(): compute percentage points of Pearson distribution
 PWELCH(): compute power spectrum by periodogram method (Welch, 1967)
 SAMPLE(): equal and unequal probability sampling with or without replacement
 SORTP(): partial sorting for quantile
 STAND(): columnwise standardization of numeric matrix wrt. location and scale
 TSLOCFOR(): forecasting zero or first order local model
 TSLOCTST(): error testing of zero or first order local model
 TSMEAS(): for a large variety of time series measurements
 TSTRANS(): for a variety of time series data transformations,
surrogates, and filters
 X11(): seasonal smoothing of monthly or quarterly time series data
 Some functions for combinatorics:
 COMBN(): generate all combinations of m elements taken n at the time
 DMNOM(): density of multinomial distribution
 HCUBE(): generate all points on hypercube lattice
 RMULT(): random generator for multinomial distribution (similar to MRAND())
 NSIMPLEX(): get number of points on (p,n) simplex
 XSIMPLEX(): generate all points on (p,n) simplex
 Illustrating Hosmer and Lemeshow Test
 Newsletter December 2009
 Extensions to NLP(): new option for nonlinear constraints
and new methods (e.g. Simulated Annealing, subgradient methods)
 Extensions to TSMEAS(): new functions added:
 Partial autocorrelations with robust asymptotic standard errors
 Ljung-Box test for serial correlation
 (Robust) LM test for serial correlation
 Newey-West covariance matrix
 OLS Regression with Newey-West (HAC) asymptotic standard errors
 (Augmented) Dickey-Fuller testing
 Granger causality testing (Likelihood-Ratio, LM, and Wald inference)
 Vector AR modeling (homo- and heteroskedastic, correlated and uncorrelated errors)
 Impulse response modeling for specified lead
(homo- and heteroskedastic, correlated and uncorrelated errors)
 Extensions to TSTRANS(): new transformations added:
 Box-Cox transform
 Lag and Log transform
 Baxter-King filtering
 Hodrick-Prescott filtering
 Extensions to KSTEST(): many new distributions added, additional returns
 BERKOW(): Berkowitz testing for time series or cross sectional data for distributions like KSTEST()
 ARMA(): ML estimation of the AutoRegressive Moving Average model
 ARMAFORE(): Forecasting using ARMA model estimates
 ARHETERO(): Heterogeneous AR model estimation
 GARCH(): ARCH, GARCH, TARCH, AVARCH, ZARCH, APARCH, EGARCH, AGARCH,
NAGARCH, IGARCH, FIGARCH model estimation
 JARBERA(): Jarque-Bera test for normal distribution
 MUCOMP(): comparing different hypotheses (model restrictions)
for linearly constrained ("confirmatory") ANOVA (Kuiper, Klugkist, & Hoijtink)
 SHAPWILK(): Shapiro-Wilk test for normal distribution
 Newsletter July 2010
 Extending val = MAX(a,...) to < val,ind > = MAX(a,...) and the same for MIN()
 Extending NNMF() for symmetric nonnegative matrix factorization, C=HH' and C=HSH'
 Extending NNMF() for (left, right, and bi) orthogonal nonnegative matrix factorization, C=UH' and C=USH'
 Extending NNMF() for orthogonal symmetric nonnegative matrix factorization, C=HH' and C=HSH'
 Creating a large number of links for the Reference Manual
 The gnuplot ... gpend syntax for interactive gnuplot input
 BORUTA(): Variable selection algorithm (wrapper of Random Forest; Kursa & Rudnicki)
 GPBATCH(): Running gnuplot input scripts in batch mode
 HELP(): Opening the Reference Manual at specific bookmark terms
 < val,ind > = MAXN(a,n) and < val,ind > = MINN(a,n) for the n largest resp.
smallest values of a
 RANFOR(): Random Forest algorithm for Classification and Regression (Breiman)
 RAFPRD(): Scoring the Random Forest model for Classification and Regression
 PROPURS(): Projected Pursuit PCA (Friedman & Tukey)
 SURVCURV(): Survival curves: Adjusted for Cox PH and Aalen's Model (Zhang et al., 2007)
 SURVREG(): Survival regression: Cox proportional Hazards Model, Aalen's additive model, GLIM models
(extreme, logistic, Gaussian; Weibull, loglog, lognormal, exponential, Rayleigh)
 SYSTEM(): Execute shell commands, save output in string data
 SPAWN(): Execute child process (with or without reentry or running concurrently)
 ZIP7(): (Encrypted) Compressing and decompressing using the 7-Zip program
 Newsletter December 2010
 Modifying LRFORW(): options matrix input argument
 Extending LOC(): for indices of missing values
 Extending NLREG(): permitting simple boundary, linear, and nonlinear constraints and specifying derivatives
 AUROC(): area under the ROC curve with asymptotic standard errors
 DELTA(): Delta method for computing asymptotic standard errors
 HBTTEST(): various forms of t test
 LRALLV(): all variables subsets regression (full enumeration and stochastic search)
 NOBLANKS(): removing leading and trailing blanks in string data
 ORDER(): hierarchical ranking in tied observations (similar to R function)
 SMP(): stochastic matching pursuit and componentwise Gibbs sampler for variable selection (Chen et al.)
 SURVCURV(): Survival curves: common types: Kaplan-Meier, Fleming-Harrington, Tsiatis, Aalen, Kalbfleisch-Prentice, Greenwood etc.
 SURVFOR(): Survival forest (not finished yet)
 SURVPRD(): Survival regression test set scoring (prediction and residuals)
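As a small illustration of what AUROC() in the list above reports, the area under the ROC curve equals the probability that a randomly chosen positive case gets a higher score than a randomly chosen negative case (ties count one half). The Python sketch below computes only this point estimate by direct pair counting; it does not compute the asymptotic standard errors, and the function name `auroc` and its arguments are assumptions for this example, not the CMAT interface.

```python
def auroc(scores, labels):
    """Area under the ROC curve by pair counting (hypothetical helper).

    scores: numeric scores, larger means "more likely positive"
    labels: 1 for positive cases, 0 for negative cases
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0   # positive ranked above negative
            elif p == q:
                wins += 0.5   # ties count one half
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

The double loop is quadratic in the sample size; for large data a rank-based computation would be the better choice.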
 Newsletter July 2011
 BIDIMREG(): Bidimensional Regression between 2-dimensional configurations (Tobler, 1994)
 CFA(): Confirmatory Factor Analysis (categorical data, robust, with automatic model search)
 DEG2RAD() and RAD2DEG(): conversion between degrees and radians
 INVUPD(): rank-1 update of the inverse of a pd matrix
 SOM(): Self-Organizing Maps, supervised Kohonen networks
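A rank-1 update of an inverse, as offered by INVUPD() in the list above, is commonly done with the Sherman-Morrison formula: if A^{-1} is known, then (A + u v')^{-1} = A^{-1} - (A^{-1} u)(v' A^{-1}) / (1 + v' A^{-1} u). The Python sketch below shows this standard formula with plain lists. Whether CMAT uses exactly this formula is an assumption, and the name `sherman_morrison` is chosen only for this example.

```python
def sherman_morrison(a_inv, u, v):
    """Return the inverse of (A + u v') given a_inv = inverse of A (hypothetical helper)."""
    n = len(a_inv)
    # w = A^{-1} u  (column vector)
    w = [sum(a_inv[i][k] * u[k] for k in range(n)) for i in range(n)]
    # z = v' A^{-1}  (row vector)
    z = [sum(v[k] * a_inv[k][j] for k in range(n)) for j in range(n)]
    denom = 1.0 + sum(v[i] * w[i] for i in range(n))
    if abs(denom) < 1e-12:
        raise ValueError("updated matrix is (numerically) singular")
    return [[a_inv[i][j] - w[i] * z[j] / denom for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    # A + u u' = diag(2, 1), so the inverse should be diag(0.5, 1)
    print(sherman_morrison(identity, [1.0, 0.0], [1.0, 0.0]))
```

The update costs O(n^2) operations instead of the O(n^3) needed for a fresh inversion. For a positive definite A and u = v the denominator is always greater than 1, so the update cannot fail.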
 Newsletter December 2011
 Now the input of hexadecimal numbers is permitted (transformed into unsigned long integers)
 Extended REPLACE(): for multiple replacements in scalars, vectors, matrices, or tensors
 Optional second argument for SRAND() for initialization of specific uniform generators
 Added: uniform random generators to RAND() function:
 Mersenne Twister (Matsumoto and Nishimura, 1998)
 Advanced Encryption Standard (AES) (Rijndael)
 GFSR4 (Ziff, 1998)
 RANLUX (Luescher, 1994)
 Worked over: MRAND() for multivariate random number generation
 New distributions for multivariate random number generation: t(mu,sigma,df), Pearson, Khintchine
 Worked over tests for univariate normality: Kolmogorov-Smirnov, Anderson-Darling, Shapiro-Wilk,
Jarque-Bera
 MVNTEST(): Various tests of multivariate normality: Mardia's tests of MV skewness and kurtosis,
Royston, Henze-Zirkler, Doornik-Hansen, Small, Mudholkar, etc.
 CPERM(): permutation of columns of a matrix
 RPERM(): permutation of rows of a matrix
 ASSOC(): data mining items (Agrawal, Imielinski, and Swami, 1993)
 RULES(): data mining items (Agrawal, Imielinski, and Swami, 1993)
 SEQU(): data mining items (Agrawal, Imielinski, and Swami, 1993)
 HANKEL(): create Hankel matrices
 CHNGTXT(): changes in string data (scalars, vectors, matrices, and tensors)
 CONVHULL(): convex hull of 2-, 3-, d-dimensional point configurations (Barber, Dobkin, and Huhdanpaa, 1996)
 DELAUNAY(): Delaunay triangulation (Barber, Dobkin, and Huhdanpaa, 1996)
 VORONIN(): Voronoi diagrams (Barber, Dobkin, and Huhdanpaa, 1996)
 Newsletter July 2012
 Worked over: ENCRYPT() and DECRYPT(): more methods (AES, SHA2,...) and vector input for names of files and directories
 CSVREAD(): Reading of CSV (Comma-Separated Values) files
 ENCRYP2(), DECRYP2(): similar to ENCRYPT() and DECRYPT(), however treats string objects (not files or directories)
 RECUPAR(): Recursive partitioning (tree splitting similar to SAS Macro TREEDISC, but much faster)
 INSERT(): with arguments compatible with SAS/IML function
 MIXREGV(): implementation of Don Hedeker's Fortran program MIXREGLS
 REMOVE(): with arguments compatible with SAS/IML function
 Almost all string functions (str...) extended for vector, matrix, and tensor arguments

 Newsletter December 2012 and July 2013
 Changed to MS Visual C/C++ 2010 and Intel Parallel Studio XE 2013
 Worked over polychoric correlations for multiple sample applications,
and functions HBTTEST(), SORTROW()
 ENCRYPT(): added SHA3 method
 HORNER(): efficient evaluation of polynomials
 KDE(): 1- and 2-dimensional kernel density estimation
 MVSVM(): stepwise multivariate SVM (Thayanathan, 2005)
 PERMCOMB(): permutation and combination (stepwise)
 PREFNAME(): generate sets of prefix names
 RVM(): Relevance vector machine
 SDCSPM(): testing Srivastava's condition
 Application: Two-Phase Logistic Modeling
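The "efficient evaluation of polynomials" named for HORNER() in the list above usually refers to Horner's scheme, which evaluates a polynomial of degree n with only n multiplications and n additions. Here is a minimal Python sketch of that scheme. The function name `horner` and the coefficient order (highest degree first) are assumptions for this example and need not match the CMAT function.

```python
def horner(coeffs, x):
    """Evaluate a polynomial at x using Horner's scheme (hypothetical helper).

    coeffs: coefficients ordered from the highest degree down to the constant,
            e.g. [2, -6, 2, -1] stands for 2x^3 - 6x^2 + 2x - 1
    """
    result = 0
    for c in coeffs:
        # Rewrite p(x) as (...((a_n x + a_{n-1}) x + a_{n-2}) x + ...) + a_0
        result = result * x + c
    return result


if __name__ == "__main__":
    print(horner([2, -6, 2, -1], 3))  # 2*27 - 6*9 + 2*3 - 1 = 5
```

Compared with computing every power of x separately, this avoids repeated exponentiation and tends to accumulate less rounding error.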
 Newsletter December 2013
 Worked over TTEST()
 Worked over linear regression: REG(), LRFORW(), and LRALL()
 Worked over general linear modeling: GLMOD(), e.g.
added multiple comparison techniques (Interfacing MULTCOMP())
 Worked over multivariate normal and t probabilities: CDFMVN()
 ICDFMV(): inverse CDF for multivariate normal and t distribution (given prob find quantile)
 MAHALANOBIS(): Mahalanobis distances (diagonal and full)
 MULTCOMP(): various parametric and nonparametric methods for
multiple comparison of means and medians of K>2 samples
 PADJUST(): adjusted multivariate probabilities (e.g. Bonferroni, Holm, Hochberg)
 PDFMV(): PDF for multivariate normal and t distribution
 WILCOX(): Wilcoxon rank sum test and signed rank test comparing two samples
 Newsletter July 2014
 Worked over LP(): interface with LPSOLVE,
now with integer constraints and sensitivity analysis
 Added to NLP(): LINCOA (LINear Constraint Optimization Algorithm) (M.J.D. Powell)
 BOUNDBOX(): compute smallest rectangular box surrounding a specified set of points
 HAMILTON(): find all or some Hamiltonian circuits in directed graphs
 KNAPSACK(): solving (approximately) the one- and multidimensional knapsack problem
 LATLONG(): various computations with Latitude and Longitude data
 LOCAT1(): solves multifacility location problem
 LPASSIGN(): solving the linear assignment problem with LPSOLVE, LAPJV, or
the Hungarian method and solving the linear bottleneck assignment problem with BOTJV
 LPTRANSP(): solving the linear transport problem with LPSOLVE
 MAXEMPTY(): finds coordinates and volume of largest empty box parallel to (x,y) axes
and surrounded by a specified set of points
 MSTGRA(): Minimum Spanning Tree based on graph data (MSTREE was renamed to MSTDIS
for Minimum Spanning Tree based on distance data)
 SPLNET(): compute shortest path (length) between two points of network (graph) data
 TSP(): various methods for solving the symmetric and asymmetric Traveling Salesman Problem
(Interface to Linkern and Concorde)
 Newsletter December 2014
 FILESTAT(): returns vector of file statistics
 MPSFILE(): transforming MPS file to matrix notation of LP specification
and vice versa
 PRITFILE(): transfer text file rowwise into vector of strings
 CONTSIM(): Random Generation of contingency tables with specified row and column sums
 All LP algorithms worked over
 Interface to Clp (Coin Linear Programming) and Cbc (Coin Branch-and-Cut) in COIN-OR
Some "Exact Statistics" for small samples:
 XCTBINOM(): exact binomial test: p values and confidence intervals
 XCTBIP1(): P(alt) for exact binomial test
 XCTBIPOW(): Power for exact binomial test
 XCTBISSZ(): Sample size for exact binomial test
 XCTPOISS(): exact Poisson test: p values and confidence intervals
 XCTFISHR(): Fisher's exact test for 2 x 2 contingency tables
 XCTFIPOW(): Power for Fisher's exact test for 2 x 2 contingency tables
 XCTFISSZ(): Sample size for Fisher's exact test for 2 x 2 contingency tables
 XCTLOG(): MCMC Method for exact logistic regression (see elrm)
 XCTMCNEM(): McNemar's exact test for 2 x 2 contingency tables
 XCTSIMU(): MC method of Fisher's exact test for m x n contingency tables
 XCTHYBR(): Hybrid method for Fisher's exact test of m x n contingency tables
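To show what an "exact" test for small samples computes, here is a minimal Python sketch of the two-sided Fisher's exact test for a 2 x 2 table. With all margins fixed, the count in the upper left cell follows a hypergeometric distribution, and the p value sums the probabilities of all tables that are no more likely than the observed one. This is the common textbook definition; it is not taken from the XCTFISHR() code, and the function name `fisher_exact_2x2` is an assumption for this example.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]] (hypothetical helper)."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    total = comb(n, col1)

    def prob(k):
        # Hypergeometric probability that the upper left cell equals k
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_obs = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Small relative tolerance so that equally likely tables are not lost to rounding
    p_value = sum(prob(k) for k in range(k_min, k_max + 1)
                  if prob(k) <= p_obs * (1.0 + 1e-7))
    return min(1.0, p_value)


if __name__ == "__main__":
    # Tea tasting example: 3 of 4 cups classified correctly in each group
    print(fisher_exact_2x2(3, 1, 1, 3))  # 34/70
```

Full enumeration like this is only practical for 2 x 2 tables with moderate counts; larger m x n tables need network or Monte Carlo methods.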
 Newsletter December 2015
 Extending data types to Structs, entries are referred by
compound name separated by dot: struct_name.entry_name
 RANK() function extended for very large and sparse matrices
 Work on preprocessor for LP() function
 CLP interface for SVM() regression (Bi et al., 2002) and SMSVM (Lee, Lin, and Wahba, 2003)
 RCCOUNT(): count the occurrence of numeric or string values in
rows and columns of a matrix
 SOUND(): playing sound (of specified frequency and duration)
on speakers
 Newsletter July 2016
 This is now the very carefully tested version 9, ready for
download. It was moved to the internet at the end of September 2016.
It may be the most error-free version since version 4.
 Work on printed output of tensors and lists
 Work on function SVM() for parameter tuning
 FNAMPID(): concatenate filename and actual process ID
 FREMOVE(): remove file with specified name
 FRENAME(): rename file
 ERROR(): print error message into log
 LSTLABEL(): assigning labels to entries of data lists
 LSTNAME(): assigning names to entries of data lists
 OUTLIER(): various methods for finding outliers in univariate data
 PID(): returns integer process ID
 SVMFSM(): SVM one-step feature selection (sparse L1 model fit)
 SVMSTW(): SVM stepwise feature selection (forward and backward, linear kernel only)
 WARNING(): print warning message into log
 Rename objects with RENAME statement
Now, CMAT got some attention at
Dilbert
Algorithm, my delight
Running at the speed of light
What the genius, who the geek
Could forge thy objects ever sleek?
In what language were thee writ
That enabled every bit?
What the templates, how applied
And what the math down deep inside?

Thou wert born with MPI
And that enabled thee to fly
On Beowulf we set thee free
All cycles thou consumed with glee.
When the stars come out at night
And ask a sacrificial rite
Do we just sneer and charge ahead
With no fear and with no dread?

All this wonder; all this speed
And yet I wait here still in need
Alas the run's untimely halt
Said naught but that some seg did fault.
Algorithm, my delight
Running at the speed of light
What the genius, who the geek
Could forge thy objects ever sleek?
