java.lang.Object
  recognizer.PersonalityRecognizer
public class PersonalityRecognizer
Given a text, the program computes the features described in (Mairesse & Walker, 2006)
and runs Weka models on those features to produce
personality scores for all Big Five dimensions.
The MRC Psycholinguistic database and the LIWC tool need to be installed,
and the file PersonalityRecognizer.conf in the main directory needs to
be modified accordingly. The PersonalityRecognizer script should be used
for launching the program.
Usage: PersonalityRecognizer [-d] [-m model_number] [-o] [-c] [-t model_type] -i file|directory

    -c,--counts      Also outputs feature counts, -d must be disabled
    -d,--directory   Corpus analysis mode. Input must be a directory with
                     multiple text files, features are standardized over
                     the corpus and the recognizer outputs a personality
                     estimate for each text file.
    -i,--input       Input file or directory (required)
    -m,--model       Model to use for computing scores (default 4). Options:
                       1 = Linear Regression
                       2 = M5' Model Tree
                       3 = M5' Regression Tree
                       4 = Support Vector Machine with Linear Kernel (SMOreg)
    -o,--outputmod   Also outputs models
    -t,--type        Selects the type of model to use (default 1). The appropriate
                     model depends on the language sample (written or
                     spoken), and whether observed personality (as perceived
                     by external judges) or self-assessed personality (the
                     writer/speaker's perception) needs to be estimated from the
                     text. Options:
                       1 = Observed personality from spoken language
                       2 = Self-assessed personality from written language
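
As an illustration of how the options above combine, here is a minimal sketch of a helper that assembles an argument list following the usage line. The class `RecognizerCommand`, the script path `./PersonalityRecognizer`, and the input directory `essays` are assumptions made for this example; none of them is part of the recognizer itself. The helper only builds the list and does not launch anything.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the recognizer: builds a command line from the documented options. */
final class RecognizerCommand {

    static List<String> build(String script, String input, boolean corpusMode,
                              int model, int type, boolean counts, boolean outputModels) {
        if (model < 1 || model > 4) {
            throw new IllegalArgumentException("model must be between 1 and 4");
        }
        if (type < 1 || type > 2) {
            throw new IllegalArgumentException("type must be 1 or 2");
        }
        if (counts && corpusMode) {
            // The usage text states that -c requires -d to be disabled.
            throw new IllegalArgumentException("-c cannot be combined with -d");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add(script);
        if (corpusMode) cmd.add("-d");
        cmd.add("-m");
        cmd.add(Integer.toString(model));
        if (outputModels) cmd.add("-o");
        if (counts) cmd.add("-c");
        cmd.add("-t");
        cmd.add(Integer.toString(type));
        cmd.add("-i");
        cmd.add(input);
        return cmd;
    }

    public static void main(String[] args) {
        // Corpus analysis of a (hypothetical) directory of written texts with the default SVM model.
        List<String> cmd = build("./PersonalityRecognizer", "essays", true, 4, 2, false, false);
        System.out.println(String.join(" ", cmd));
        // prints "./PersonalityRecognizer -d -m 4 -t 2 -i essays"
    }
}
```

The resulting list could be passed to `java.lang.ProcessBuilder` if the script is to be launched from Java.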
See the included readme file and the website
http://www.dcs.shef.ac.uk/~francois/personality/recognizer.html
for more information.
Questions can be emailed to the author at F.Mairesse@sheffield.ac.uk.
Reference paper:
Francois Mairesse and Marilyn Walker.
Words Mark the Nerds: Computational Models of Personality Recognition through Language.
In Proceedings of the 28th Annual Conference of the Cognitive Science Society
(CogSci 2006), Vancouver, July 2006.
Available on the web in PDF format at
http://www.dcs.shef.ac.uk/~francois/papers/cogsci06.pdf
| Field Summary | |
|---|---|
| static java.io.File | DEFAULT_CONFIG_FILE: Configuration file (default is PersonalityRecognizer.conf in root application directory). |
| static java.lang.String[] | DIMENSIONS: Personality dimension names. |
| static java.lang.String | FS: File separator. |
| static java.lang.String | LS: Line separator. |
| Constructor Summary | |
|---|---|
| PersonalityRecognizer() | Initializes parameters based on the default configuration file (PersonalityRecognizer.conf). |
| PersonalityRecognizer(java.io.File propFile) | Initializes parameters based on configuration file, and loads the LIWC dictionary and the MRC database in memory. |
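
The constructor detail describes the configuration file as ASCII text with one `VARIABLE = "VALUE"` entry per line. The following is a minimal sketch of how such a file might be read, not the recognizer's actual parser. The class `ConfSketch` and the variable name `EXAMPLE_PATH` are hypothetical, and the choice to skip lines that do not match the pattern is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical reader for lines of the form VARIABLE = "VALUE". */
final class ConfSketch {

    // A name, an equals sign, and a double-quoted value, with optional surrounding whitespace.
    private static final Pattern ENTRY =
            Pattern.compile("^\\s*(\\w+)\\s*=\\s*\"(.*)\"\\s*$");

    static Map<String, String> parse(String contents) {
        Map<String, String> values = new LinkedHashMap<>();
        for (String line : contents.split("\\R")) {
            Matcher m = ENTRY.matcher(line);
            if (m.matches()) {
                values.put(m.group(1), m.group(2));
            }
            // Assumption: lines that do not match are silently skipped.
        }
        return values;
    }

    public static void main(String[] args) {
        String sample = "EXAMPLE_PATH = \"/opt/data\"\nnot a setting\n";
        System.out.println(parse(sample));
    }
}
```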
| Method Summary | |
|---|---|
| java.util.Map<java.io.File,java.lang.Double[]> | computeScoresOverCorpus(java.io.File dir, weka.classifiers.Classifier[] models): Runs the models of each personality trait for each file in the directory. |
| java.util.Map<java.lang.String,java.lang.Double> | getFeatureCounts(java.lang.String text, boolean relativeOnly): Computes the features from the input text (70 LIWC features and 14 from the MRC database). |
| int | getModelIndex(): Gets the current default model index. |
| int | getModelIndex(java.lang.String modelDir): Gets the model index in the MODEL_NAMES array from a string representation. |
| weka.classifiers.Classifier[] | loadWekaModels(boolean selfModel, boolean stdModels): Loads saved Weka models in memory for all personality dimensions, using the default model type. |
| weka.classifiers.Classifier[] | loadWekaModels(int modelIndex, boolean selfModel, boolean stdModels): Loads saved Weka models in memory for all personality dimensions. |
| static void | main(java.lang.String[] args): Main method that initializes the parameters from the configuration file, counts the features from the input text(s), runs the specified Weka models on this feature set for each Big Five personality trait, and writes the personality score estimates to the standard output. |
| void | printOutput(weka.classifiers.Classifier[] models, double[] scores, int modelIndex, boolean printModels, boolean self, java.io.PrintStream out): Prints personality scores to standard output, and model details if required. |
| void | printOutput(weka.classifiers.Classifier[] models, java.util.Map<java.io.File,java.lang.Double[]> scores, int modelIndex, boolean printModels, boolean self, java.io.PrintStream out): Prints personality scores of multiple files to standard output, and model details if required. |
| double[] | runWekaModels(weka.classifiers.Classifier[] models, java.util.Map<java.lang.String,java.lang.Double> counts): Runs each Weka model on a new instance created from the input feature counts, and outputs the resulting personality score. |
| void | setModel(int modelIndex): Sets the default Weka model to load when calling loadWekaModels(). |
| void | setModel(java.lang.String modelDir): Sets the default Weka model to load when calling loadWekaModels(). |
| Methods inherited from class java.lang.Object |
|---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.io.File DEFAULT_CONFIG_FILE
public static final java.lang.String[] DIMENSIONS
public static final java.lang.String FS
public static final java.lang.String LS
| Constructor Detail |
|---|
public PersonalityRecognizer()
public PersonalityRecognizer(java.io.File propFile)
propFile - configuration file in ASCII format (VARIABLE = "VALUE" on each line).

| Method Detail |
|---|
public java.util.Map<java.io.File,java.lang.Double[]> computeScoresOverCorpus(java.io.File dir,
weka.classifiers.Classifier[] models)
dir - input directory containing multiple text files.
models - models of each Big Five personality trait.
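
In corpus analysis mode, features are standardized over the corpus before the models are run. The documentation does not say how; the sketch below assumes a per-feature z-score (subtract the corpus mean, divide by the population standard deviation), which is one common choice and may differ from what the recognizer does. The class `CorpusStandardizer` is hypothetical and assumes every document provides the same feature names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical z-score standardization of feature maps over a corpus. */
final class CorpusStandardizer {

    static List<Map<String, Double>> standardize(List<Map<String, Double>> docs) {
        List<Map<String, Double>> out = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            out.add(new LinkedHashMap<>());
        }
        if (docs.isEmpty()) {
            return out;
        }
        for (String feature : docs.get(0).keySet()) {
            double mean = 0.0;
            for (Map<String, Double> doc : docs) {
                mean += doc.get(feature);
            }
            mean /= docs.size();

            double variance = 0.0;
            for (Map<String, Double> doc : docs) {
                double diff = doc.get(feature) - mean;
                variance += diff * diff;
            }
            double sd = Math.sqrt(variance / docs.size());

            for (int i = 0; i < docs.size(); i++) {
                // A constant feature carries no information; map it to 0 instead of dividing by zero.
                double z = sd == 0.0 ? 0.0 : (docs.get(i).get(feature) - mean) / sd;
                out.get(i).put(feature, z);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> docs = new ArrayList<>();
        Map<String, Double> a = new LinkedHashMap<>();
        a.put("WC", 100.0);
        Map<String, Double> b = new LinkedHashMap<>();
        b.put("WC", 300.0);
        docs.add(a);
        docs.add(b);
        System.out.println(standardize(docs));
    }
}
```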
public java.util.Map<java.lang.String,java.lang.Double> getFeatureCounts(java.lang.String text,
boolean relativeOnly)
throws java.lang.Exception
text - input text.
relativeOnly - if true, absolute count features (WC) are not returned; must be set to false if
standardized features are used (corpus analysis mode).
Throws: java.lang.Exception

public int getModelIndex()
public int getModelIndex(java.lang.String modelDir)
modelDir - the model subdirectory in the MODEL_DIRS array corresponding
to the model to load.
public weka.classifiers.Classifier[] loadWekaModels(boolean selfModel,
boolean stdModels)
selfModel - if set to true, loads the self-report models.
stdModels - if set to true, loads the standardized models.
public weka.classifiers.Classifier[] loadWekaModels(int modelIndex,
boolean selfModel,
boolean stdModels)
modelIndex - the index of the element in the MODEL_DIRS array corresponding
to the directory of the model to load.
selfModel - if set to true, loads the self-report models.
stdModels - if set to true, loads the standardized models.
public static void main(java.lang.String[] args)
args - set of options and input file(s).
public void printOutput(weka.classifiers.Classifier[] models,
double[] scores,
int modelIndex,
boolean printModels,
boolean self,
java.io.PrintStream out)
models - array of Weka models.
scores - array of personality scores to print.
modelIndex - index of the model used in the MODEL_NAMES array.
printModels - if true, prints out a textual representation of the models.
out - output stream.
public void printOutput(weka.classifiers.Classifier[] models,
java.util.Map<java.io.File,java.lang.Double[]> scores,
int modelIndex,
boolean printModels,
boolean self,
java.io.PrintStream out)
models - array of Weka models.
scores - map associating each file to an array of personality scores to print.
modelIndex - index of the model used in the MODEL_NAMES array.
printModels - if true, prints out a textual representation of the models.
out - output stream.
public double[] runWekaModels(weka.classifiers.Classifier[] models,
java.util.Map<java.lang.String,java.lang.Double> counts)
models - array of Weka models (Classifier objects).
counts - mapping of feature counts (Double objects); it must provide
a value for all attribute strings of the input models.
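
Because the counts map must provide a value for every attribute of the models, a caller may want to check coverage before running them. The sketch below shows one way to do that with plain collections. The class `FeatureCoverage` and the attribute name `EXAMPLE_FEATURE` are hypothetical, and how the attribute names are obtained from a Weka model is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-check: lists the attributes that the counts map does not cover. */
final class FeatureCoverage {

    static List<String> missing(List<String> attributeNames, Map<String, Double> counts) {
        List<String> missing = new ArrayList<>();
        for (String name : attributeNames) {
            // A key mapped to null is treated as missing as well.
            if (counts.get(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Double> counts = new HashMap<>();
        counts.put("WC", 120.0);
        List<String> required = Arrays.asList("WC", "EXAMPLE_FEATURE");
        System.out.println("Missing attributes: " + missing(required, counts));
    }
}
```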
public void setModel(int modelIndex)
modelIndex - the index of the element in the MODEL_DIRS array corresponding
to the directory of the model to load.

public void setModel(java.lang.String modelDir)
modelDir - the model subdirectory in the MODEL_DIRS array corresponding
to the model to load.