diff --git a/docs/report/report-en.pdf b/docs/report/report-en.pdf
index 856dd40..c36fcf1 100644
Binary files a/docs/report/report-en.pdf and b/docs/report/report-en.pdf differ
diff --git a/docs/report/report-en.tex b/docs/report/report-en.tex
index 6024d65..420b03b 100644
--- a/docs/report/report-en.tex
+++ b/docs/report/report-en.tex
@@ -1,757 +1,706 @@
\documentclass[a4paper,12pt]{article}
\usepackage[a4paper,left=3cm,right=3cm,top=3cm,bottom=3cm]{geometry}
\usepackage[english]{babel}
\usepackage[parfill]{parskip}
\usepackage{graphicx}
\usepackage{xeCJK}
\setCJKmainfont{Songti SC Light}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{amsthm}
\usepackage{xunicode}
\usepackage[utf8]{inputenc}
\usepackage[charter]{mathdesign}
\usepackage{url}
\usepackage{hyperref}
\usepackage{multirow}
\usepackage[toc,page]{appendix}
\usepackage{tabularx}
\usepackage{longtable}
\usepackage{listings}
\lstset{basicstyle=\footnotesize\ttfamily,breaklines=true, upquote=true}
\usepackage{textcomp}
\usepackage{graphicx}
\usepackage{subfig}
\usepackage[labelfont={small,sc}, font={small}]{caption}
\DeclareTextCommand{\nobreakspace}{T1}{\leavevmode\nobreak\ }
\title{Large-scale Programming Language Detection}
\author{Yuan YIN}
\date{}
\begin{document}
\maketitle
\begin{abstract}
(to be completed)
\end{abstract}
\tableofcontents
\section{Introduction}
-Programming Language Detection is a problem of identifying which programming language is a piece of source code written in. We here define the piece of source code as a textual sequential representation of an artefact, which is normally in the form of character sequence or, more generally, byte sequence. More precisely, the objective is to build a model which could predict the language of a given byte sequence.
+Programming Language Detection is the problem of identifying which programming language a piece of source code is written in. We define a piece of source code as a textual sequential representation of an artefact, normally in the form of a character sequence or, more generally, a byte sequence. More precisely, the objective is to build a model that predicts the language of a given sequence.
-The formal definition of the problem as follows: on the input, given a byte sequence $d$ and the number of a languages $n$,
+The problem is formally defined as follows: given a byte sequence $d$ and $n$ languages $l_1, \dots, l_n$,
\[l_d = \underset{l_i\in \{l_1, ..., l_n\}}{\arg \max}\ m(d, l_i),\]
where $l_d$ is the predicted language: the model $m$ computes a value indicating the likelihood that document $d$ is written in language $l_i$, and the most likely language is chosen as the recognised language of the document.
Programming Language Detection is useful in several situations. Example applications include: determining the language composition of software projects in version control systems (for instance, the GitHub team develops the project Linguist to report which languages a project is written in); searching code in plain text, in order to track the popularity of a language; and helping IDEs choose the language whose support functionalities, such as syntax highlighting, should be enabled.
-We dive into this problem in the context of \emph{Software Heritage}. \emph{Software Heritage}, initiated by Inria, is an archive in which 4 billions source code files from 80 millions projects are stored.
+We study this problem in the context of \emph{Software Heritage}. \emph{Software Heritage}, initiated by Inria, is an archive in which 4 billion source code files from 80 million projects are stored. The ``large scale'' nature of the problem concerns not only the size of the archive but also the number of languages.
Language detection is needed by \emph{Software Heritage} because the language of a file cannot be inferred from its filename extension. In \emph{Software Heritage}, every source code file is a blob containing the raw content of the file, that is, a sequence of bytes without any extra information such as the filename (including its extension) or metadata. Since each blob is represented by an intrinsic identifier generated from the blob itself, the duplication of files is avoided. For this reason, all existing tools depending on filenames fail in our context, and methods for recognising the language from a sequence of bytes are strongly needed.
-(To be fixed after the redaction)
+In this report, we briefly introduce the state-of-the-art methods in Section 2. In Section 3, the procedure for building a feasible dataset is described. In Section 4, we explain the methods that we took into account for the evaluation. Experimental results, including the comparison between methods and observations on certain questions, are described in Section 5. The best performing method is finally applied to a subset of \emph{Software Heritage}; Section 6 gives a preview of its performance in the real world. Section 7 outlines several possible tracks for the future improvement of the tools.
-In this report, we introduce briefly the state-of-the-art methods in Section 2. In Section 3, the procedure of making a feasible dataset is related. In Section 4, we explain the methods that we took in account for the evaluation.
-
-We provide the implemented methods and more detailed results on Forge of \emph{Software Heritage} \footnote{\url{http://}}.
+We provide the implemented methods and more detailed results on Forge of \emph{Software Heritage} \footnote{\url{http://??}}.
\section{Related Works}
The existing approaches could be divided into two categories: practical methods and machine learning methods.
Practical methods are mostly based on several empirical or external information, basic ideas are presented as follows:
\begin{itemize}
 \item Judging from the filename extension. Ohcount\cite{ohcount} and Linguist\cite{linguist} perform the detection by hashing the filename extension. The problem with this straightforward method is that some extensions are related to several languages, \emph{e.g.} \texttt{*.m} refers to a file written in Objective-C or MATLAB, \texttt{*.pl} points to Perl or Prolog.
 \item Grammar-based approaches. The principle is to parse the file with the grammar of every language, which is complex to model and demands heavy computation time.
 \item Heuristic approaches. Most of them, such as SLOCCount\cite{sloccount}, use predefined regular expressions to capture empirically discovered features, \emph{e.g.} a file starting with ``\texttt{\#include}'' is probably written in C. Others look for hints in the file, such as shebang lines, Vim modelines, Emacs modelines, \emph{etc}.
\end{itemize}
In machine learning, the problem is regarded as a sub-problem of \emph{text categorisation} or \emph{text classification}: given a piece of text, we look for a function that predicts which category the text belongs to. The state-of-the-art methods build such a function from example input-output pairs, an approach known as \emph{supervised learning}.
-Ugurel \emph{et al.} \cite{Ugurel02} selects firstly the features by Expected Entropy Loss for each language, then vectorise the tested document into a vector representing the presence of a selected feature. Since Support Vector Machine (SVM) is binary classifier, the $n$-class classification is resolved by training $n \choose 2$ SVMs in the form of decision tree. Van Dam and Zaytsev \cite{vanDam16} test several popular and performant methods in Natural Language Processing. Multi-nominal Naïve Bayes (MNB), one of the variants of Naïve Bayes Classifiers, utilises unified frequency of a word or a sequence of words in a byte-sequence to decide the most possibly corresponding programming language. $N$-gram model and skip-gram model calculate for each gram the possibility of its appearance after $N$ grams. Normalised Compression Distance compares a piece of compressed code to the examples in the training set, then chooses the nearest language on as projection. MNB and $N$-gram model outperform others according to the experimental results. Gilda\cite{Gilda17} adopts a general setup of Convolutional Neurone Network (ConvNet) in NLP and proofs its performance.
+Ugurel \emph{et al.} \cite{Ugurel02} first select the features of each language by Expected Entropy Loss, then vectorise the tested document into a vector representing the presence of each selected feature. The $n$-class classification is resolved by training $\binom{n}{2}$ Support Vector Machine (SVM) binary classifiers arranged in the form of a decision tree. Van Dam and Zaytsev \cite{vanDam16} test several popular and well-performing methods from Natural Language Processing. Multinomial Naïve Bayes (MNB), one of the variants of Naïve Bayes classifiers, utilises the unified frequency of a word or a sequence of words in a byte sequence to decide the most probable programming language. The $N$-gram model and the skip-gram model calculate for each gram the probability of its appearance after the $N-1$ preceding grams. Normalised Compression Distance compares a piece of compressed code to the examples in the training set, then chooses the nearest language as the prediction. MNB and the $N$-gram model outperform the others according to the experimental results. Gilda~\cite{Gilda17} adopts a general setup of Convolutional Neural Network (ConvNet) from NLP and demonstrates its performance.
\section{Dataset}
-We considered either applying supervised learning and unsupervised learning for the problem. However, the usage of unsupervised learning is quite limited (we will talk about it later in Section 6). We then focus on supervised methods.
+We considered applying both supervised and unsupervised learning to the problem. However, the usage of unsupervised learning is quite limited in classification problems (we discuss it later in Section~7). We therefore focus on supervised methods.
-Supervised learning methods require a dataset containing labeled inputs to train and to evaluate the model. Nowadays, since Programming Language Detection is not seriously considered as an important subject in machine learning, for the reason that it could be resolved by adopting existing classifiers of ML, the articles are rarely accompanied by a publicly available dataset. Therefore, we natively build a novel dataset for our experiments.
+Supervised learning methods require a dataset containing labelled inputs to train and evaluate the model. Since Programming Language Detection is not currently considered an important subject in machine learning, on the grounds that it can be resolved by adopting existing ML classifiers, articles are rarely accompanied by a publicly available dataset. Therefore, we build a new dataset of our own for our experiments.
-GitHub\footnote{\url{https://www.github.com/}} is one of the most popular web-based hosting service for Git version control system, reporting having more than 57 million repositories. We decide to build the dataset using GitHub.
+GitHub\footnote{\url{https://www.github.com/}} is one of the most popular web-based hosting services for the Git version control system, reporting more than 57 million repositories. We build the dataset using GitHub for the large number of languages it includes and its popularity in the free software community.
\paragraph{Ground Truth Supposition}
In the context of \emph{Software Heritage}, our aim is to cover as many languages as possible for classification; the dataset we build thus inevitably contains a large number of files, too many to be labelled manually. We therefore seek help from automatic labelling tools.
Linguist \cite{linguist} is the language detection tool developed by the GitHub team for unveiling the language composition of a Git repository, a service provided on GitHub through its API. There exists a command-line version of Linguist producing the list of files by language for a repository. Given that filename extensions are visible to Linguist and that such features boost enormously the accuracy of classification (we support this claim in a later experiment), we suppose that the language recognised by Linguist for a file is its ground truth language. Since the original Linguist did not give detailed results for some data description languages, \emph{e.g.} XML and JSON, we slightly modified Linguist to integrate these missing languages.
\paragraph{Source Code Recuperation and Languages Included}
The dataset is built in the context of \emph{Software Heritage}. Therefore, the list of languages we consider integrating in the system covers as many languages as possible.
We initially took the entire language list of Linguist into account for repository fetching. For each language, we fetch the first 75 repositories at the top of the list ordered by number of stars, which reflects the popularity of a repository. To avoid huge repositories, we ignore all repositories whose size exceeds 150~MiB.
-We then eliminate some languages, \emph{i.e.} data description languages, which we could not fetch any repository from GitHub. We successfully fetched 3,525,897 files for 323 valid languages showed in Table~\ref{tab:lan}.
+We then eliminate some languages, \emph{e.g.} data description languages, for which we could not fetch any repository from GitHub. We successfully fetched 5,162,128 files for 374 valid languages, shown in Table~\ref{tab:lan}.
\section{Methods for Evaluation}
In this section, we describe the NLP methods tested on our dataset:
\begin{itemize}
\item $n$-gram-based frequency distance model,
\item $n$-gram model,
 \item Multinomial Naïve Bayes (MNB), and
 \item Convolutional Neural Networks (ConvNet).
\end{itemize}
The first approach is regarded as a baseline method for the evaluation of the accuracy and the efficiency of the model.
Given that in \emph{Software Heritage} every file is only a sequence of bytes whose encoding we cannot determine, and for which we cannot even tell whether it is a binary file or not, we explore approaches at the byte level.
\subsection{Baseline: $n$-gram-based frequency distance}
\paragraph{$n$-gram}
An $n$-gram is a slice of $n$ units taken from a larger sequence. In NLP, the sequence is naturally a string. Depending on the problem, a unit represents a character or a word.
For example, the string ``\texttt{print(n)}'' with 8 characters could generate following character based $n$-grams:
\begin{itemize}
\item unigrams: \texttt{p, r, ..., )}
\item bigrams: \texttt{\textvisiblespace p, pr, ri, ..., n), )\textvisiblespace}
 \item trigrams: \texttt{\textvisiblespace\textvisiblespace p, \textvisiblespace pr, pri, rin, ..., n)\textvisiblespace, )\textvisiblespace\textvisiblespace}
\item ...
\end{itemize}
or word-based $n$-grams:
\begin{itemize}
\item unigrams: \texttt{, print, (, n, ), }
\item bigrams: \texttt{ print, print (, ( n, n ), ) }
\item trigrams: \texttt{ print (, print ( n, ( n ), n ) }
\item ...
\end{itemize}
Strings are often padded with a start marker and an end marker. In general, without padding, a sequence of $k$ units generates exactly $k-(n-1)$ $n$-grams; for instance, the 8-character string above yields $8-2=6$ unpadded trigrams.
Cavnar and Trenkle \cite{Cavnar94} introduce an early NLP method using the distance between two $n$-gram frequency profiles.
Zipf's law is an empirical observation stating that the $n$-th most common word in a human language occurs with a frequency inversely proportional to $n$. By retaining the most common words, it is therefore possible to obtain a list describing the characteristics of the language.
Given a training set, at the training phase, a bag of $n$-grams is generated for each document in the training set. By gathering all bags of a language and counting the occurrences of each $n$-gram, a list of $n$-grams ordered by number of occurrences is created as the \emph{category profile} of the class. Only the most frequent 300 $n$-grams are kept, since they are highly correlated to the language.
The \emph{distance} between category profile and document profile is defined as follows:
Given trained category profiles $p_{l_1}, ..., p_{l_k}$ for $k$ languages, and document profile $p_{d}$ of test document $d$,
\[
distance(p_{l_i}, p_{d}) = \sum_{w\in p_{d}} | rankdist(w, p_d, p_{l_i})|
\]
\[
rankdist(w, p_d, p_{l_i})=
\begin{cases}
|rank(w, p_d) - rank(w, p_{l_i})| & \text{if }rank(w, p_{l_i}) \text{ exists,} \\
|p_d| & \text{else}
\end{cases}
\]
where a profile $p$ is an ordered list of words and $rank(w, p)$ returns the rank of $w$ in list $p$. $rankdist(w, p_d, p_{l_i})$ returns the out-of-place distance of $w$ between the two profiles if $w$ appears in $p_{l_i}$. If $w$ is an out-of-vocabulary word, the distance is the length of the document profile $p_d$.
We then assign the document to the language with the minimum distance.
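As a small worked illustration (the two profiles below are made up for this example and are not taken from our dataset; ranks are assumed to start at 1), consider a document profile $p_d = (\texttt{a}, \texttt{b}, \texttt{c})$ and a category profile $p_{l} = (\texttt{b}, \texttt{a}, \texttt{d})$, both ordered by decreasing frequency. Then
\[
rankdist(\texttt{a}, p_d, p_l) = |1 - 2| = 1, \quad
rankdist(\texttt{b}, p_d, p_l) = |2 - 1| = 1, \quad
rankdist(\texttt{c}, p_d, p_l) = |p_d| = 3,
\]
so that $distance(p_l, p_d) = 1 + 1 + 3 = 5$. A category profile ranking the same three $n$-grams in the same order as $p_d$ would obtain a distance of 0 and would therefore be preferred.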
\subsection{Multinomial Naïve Bayes}
This approach is introduced by van Dam and Zaytsev \cite{vanDam16}.
In the Naïve Bayes model, we assume that the words of a document are independent of each other given the language. According to Bayes' Theorem,
\begin{eqnarray*}
P(l|w_1w_2...w_n) & = & \frac{P(w_1w_2...w_n|l)P(l)}{P(w_1w_2...w_n)} \\
& = & c\cdot P(l) P(w_1w_2...w_n|l)\\
& = & c\cdot P(l) \prod_{i = 1}^n P(w_i|l)
\end{eqnarray*}
The probability of $w_i$ in language $l$ is estimated from its occurrences in the language under the bag-of-words assumption, with add-one smoothing:
\[
P(w_i|l) = \frac{C_l(w_i) + 1}{\sum_{w\in V}C_l(w) + |V|}
\]
where $C_l(w)$ gives the frequency of word $w$ in language $l$ and $V$ is the vocabulary over all languages.
The assumption of independence between words is quite limiting for classification; in practice, we replace the words of the original method with unigrams to 5-grams in order to take the context of words into account.
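To illustrate the smoothed estimate of $P(w_i|l)$ with made-up counts (assumed for this example only), suppose that the vocabulary contains $|V| = 4$ distinct words and that the training files of language $l$ contain $\sum_{w\in V}C_l(w) = 96$ word occurrences in total, 3 of which are occurrences of $w_i$. Then
\[
P(w_i|l) = \frac{3 + 1}{96 + 4} = 0.04,
\]
whereas a word never observed in language $l$ receives $\frac{0 + 1}{96 + 4} = 0.01$ instead of 0, so that a single unseen word does not cancel the whole product $\prod_{i} P(w_i|l)$.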
\subsection{$n$-gram model}
The approach is introduced by van Dam and Zaytsev~\cite{vanDam16}. Like the previous method, the $n$-gram model also utilises statistical properties of $n$-grams, but in another way.
Originally, the $n$-gram model aims at predicting the probability of a unit given the $n-1$ units that occurred before it. Given a unit $w_i$, the probability of its occurrence in a sequence is defined as:
\[
P(w_i | w_1...w_{i-1})
\]
According to Markov assumption, we omit older context in the sequence,
\[
P(w_i | w_1...w_{i-1}) \approx P(w_i | w_{i-(n-1)}...w_{i-1})
\]
In practice, the probability can be estimated by maximum likelihood estimation (MLE):
\[
P(w_i | w_{i-(n-1)}...w_{i-1}) = \frac{C(w_{i-(n-1)}...w_{i-1}w_{i})}{C(w_{i-(n-1)}...w_{i-1})}
\]
where $C$ gives the count of given $n$-gram.
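For instance, with $n = 3$ at the character level and hypothetical counts (chosen only for illustration) $C(\texttt{in}) = 50$ and $C(\texttt{int}) = 20$, the maximum likelihood estimate of the probability that \texttt{t} follows \texttt{in} is
\[
P(\texttt{t} \,|\, \texttt{in}) = \frac{C(\texttt{int})}{C(\texttt{in})} = \frac{20}{50} = 0.4 .
\]
A trigram that never occurs in the counted data would receive probability 0 under this estimate, which is why smoothing is needed when the model is used as a classifier.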
By chain rule of probability and precedent estimation,
\[
P(w_1w_2...w_n)\approx \prod_{i = 1}^n P(w_i|w_{i-(n-1)}...w_{i-1})
\]
Now we transform such model into a classifier. Given a sequence $w_1w_2...w_n$, we assume that each language $l$ appears with the same probability and the probability of a given sequence is fixed.
According to Bayes' Theorem,
\begin{eqnarray*}
P(l|w_1w_2...w_n) & = & \frac{P(w_1w_2...w_n|l)P(l)}{P(w_1w_2...w_n)} \\
& = & c\cdot P(w_1w_2...w_n|l)\\
& = & c\cdot \prod_{i = 1}^n P(w_i|w_{i-(n-1)}...w_{i-1}, l)
\end{eqnarray*}
Rather than counting $n$-grams in the document, the probability of $n$-gram is estimated from the $n$-gram frequency of language, obtained from training set.
\[
P(w_i | w_{i-(n-1)}...w_{i-1}, l) = \frac{C_l(w_{i-(n-1)}...w_{i-1}w_{i})}{C_l(w_{i-(n-1)}...w_{i-1})}
\]
where $C_l$ gives the count of the given $n$-gram in the training set of language $l$.
When estimating the probability of $n$-grams, smoothing techniques are required because of the possible occurrence of \emph{out-of-vocabulary (OOV)} $n$-grams. In our case, Modified Kneser-Ney is applied, since it is one of the methods that give better experimental results in \cite{vanDam16}.
\subsection{Convolutional Neural Network (ConvNet)}
Convolutional Neural Networks are one of the most popular branches of machine learning, usually used for image classification. They are a class of deep feed-forward artificial neural networks.
The following two architectures are tested in Section 5.
\subsubsection{Word-level Approach}
\label{sec:word-conv}
Although Gilda \cite{Gilda17} shows the performance of his own architecture, we are not able to rebuild the same network due to the lack of details on the network architecture and the hyper-parameter configuration. We therefore turn to other architectures.
Kim \cite{Kim14} introduces a ConvNet for natural language sentence classification. Figure~\ref{fig:word-convnet} illustrates the architecture of the network.
\paragraph{Word Embedding}
In this architecture, word is the unit of the input. The $i$-th word $w_i$ is transformed into a vector $\mathbf{x}_i \in \mathbb{R}^k$ by word embedding level using \texttt{word2vec}. Word vectors are then concatenated to form the representation of the document, an $n\times k$ matrix.
The number of words $n$ of the document is fixed by the model. Therefore, a document longer than $n$ words needs to be truncated and a shorter one needs to be padded, by concatenating zero-vectors at the beginning or the end of the matrix.
\paragraph{Feature Extraction}
In the convolutional layers, a \emph{filter} $\mathbf{w_h} \in \mathbb{R}^{hk}$ is applied to a window of $h$ consecutive words to generate a \emph{feature} $c_i$,
\[c_i = f(\mathbf{w_h}\cdot(\mathbf{x}_i\ ||\ \mathbf{x}_{i+1}\ ||\ ...\ ||\ \mathbf{x}_{i+h-1}) + b)\]
where $||$ is the vector concatenation operator, $b\in \mathbb{R}$ is a bias term, and $f$ is an \emph{activation function} outputting a feature from a set of inputs.
This procedure utilises the similar principle of $n$-gram model, but rather than extracting features from original words, ConvNet works on their vector representation.
Each filter produces a \emph{feature map}, a vector $\mathbf{c}^h\in \mathbb{R}^{n - h+1}$. A max-over-time-pooling is then applied on the feature map $\mathbf{c}^h$, aiming at choosing the most important features with the highest values and avoiding overfitting at training stage. We then obtain the final feature map of this $h\times k$ filter.
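As a small worked example of these dimensions (all values below are made up for illustration), take a document of $n = 6$ words and a filter spanning $h = 3$ words. The filter can be placed at positions $i = 1, \dots, 4$, so the feature map has $n - h + 1 = 6 - 3 + 1 = 4$ entries, say
\[
\mathbf{c}^h = (0.1,\ 0.7,\ 0.3,\ 0.2) \in \mathbb{R}^{4},
\qquad
\max_i c_i = 0.7 .
\]
Max-over-time pooling keeps only the value $0.7$, so each filter contributes a single number to the final representation, whatever the length of the document.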
Several filters are often applied to obtain the corresponding feature map, representing a \emph{channel}. They are then concatenated vertically into a final feature map $\mathbf{c}$.
\paragraph{Classification}
A \emph{fully connected layer} is a layer of a traditional multi-layer perceptron whose neurons are connected to every neuron of the preceding and following layers. It uses a softmax activation function in the output layer.
The feature map $\mathbf{c}$ is then fed into fully connected layers, which extract higher-level features in preparation for the final classification. The output of these fully connected layers is a vector indicating the score obtained for each class. The higher the score, the more likely the document belongs to this class.
\subsubsection{Byte-level Approach}
Kim \cite{Kim15} introduces a character-level ConvNet for language modelling. The original architecture is adapted by Chaitanya Joshi\footnote{\url{https://github.com/chaitjo/character-level-cnn}} to obtain a classification model, by replacing the recurrent layers with the same fully connected layers as in the word-level approach of Section~\ref{sec:word-conv}.
Instead of using words or tokens as features, the character-level approach makes use of characters (or bytes) without building a large vocabulary. The vocabulary is then considerably small, \emph{e.g.} 256 entries when every byte is used as a character.
Feature extraction and classification are similar to the word-level approach.
\section{Experimental Results}
In this section, we present several questions that we aim to answer through experiments on our customised dataset.
\subsection{Implementation and System Setup}
We implement the methods described in Section 4 in Python 3, in order to finally integrate one of them in \emph{Software Heritage}.
-Baseline method is implemented natively in Python. We implement MNB using Scikit-learn. $n$-gram model is implemented with KenLM \cite{kenlm}. The last two ConvNets are both implemented with Keras \cite{keras} using Tensorflow \cite{tensorflow2015-whitepaper} as backend.
+The baseline method is implemented in pure Python. We implement MNB using scikit-learn. The $n$-gram model is implemented with KenLM \cite{kenlm}. The two ConvNets are both implemented with Keras \cite{keras} using TensorFlow \cite{tensorflow2015-whitepaper} as backend.
-We execute principally the training and test phase on a portable computer with 2.7 GHz Intel Core i5 processor running macOS 10.3. The training phase of two ConvNet methods are executed in an instance running Ubuntu 16.04 with one Intel Sandy Bridge virtual CPU, equipped with one NVIDIA Tesla K80 GPU on Google Cloud Platform. The instance is configured for making use of Tensorflow backend with GPU acceleration using CUDA Deep Neural Network Library (cuDNN).
+We mainly execute the training and test phases on a laptop with a 2.7 GHz Intel Core i5 processor running macOS 10.13. The training phase of the two ConvNet methods is executed on an instance running Ubuntu 16.04 with one Intel Sandy Bridge virtual CPU, equipped with one NVIDIA Tesla K80 GPU on Google Cloud Platform. The instance is configured to use the TensorFlow backend with GPU acceleration through the CUDA Deep Neural Network library (cuDNN).
\subsection{Training Set and Test Set}
Files of the training set are first randomly picked from the dataset. To avoid an imbalance of the training set, which impacts the performance of several methods of Section 4, we limit the number of training files to 500 per language. The test set is then built from the remaining samples and includes up to 1000 files per language.
We built 3 series of training set and test set of different sizes:
\begin{itemize}
 \item \texttt{mini}: 20 languages in \cite{vanDam16}, 10,000 training files, 20,000 test files.
- \item \texttt{less}: 109 languages collecting more than 5,000 files in dataset, 54,500 training files, 109,000 test files.
- \item \texttt{total}: 323 languages in Table~\ref{tab:lan}, 136,609 training files, 248,924 test files.
+ \item \texttt{less}: 127 languages with more than 5,000 files collected in dataset, 63,500 training files, 127,000 test files.
+ \item \texttt{total}: 374 languages in Table~\ref{tab:lan}, 157,897 training files, 286,224 test files.
\end{itemize}
\subsection{Tokenisation}
In our case, tokenisation is unnecessary for the byte-level variants of the methods. The purpose of introducing a simple general tokeniser is to break a document into words in order to use word-based methods.
It is difficult to summarise the relationship between the alphabet of a programming language and its byte representation. We empirically suppose that most programming languages share some basic characters, \emph{e.g.} the Latin alphabet, parentheses, spaces, \emph{etc.}, and that most encoding standards cover these characters in common.
A binary document is split on a set of characters (operators, punctuation, spaces, \emph{etc.}) and numbers (integer, float, \emph{etc.}). All separators are kept as tokens after splitting.
For example, the string ``\verb|print ("Hello world! 你好,世界!")|'' encoded in UTF-8 has the byte representation
\begin{lstlisting}
print ("Hello world! \xe4\xbd\xa0\xe5\xa5\xbd\xef\xbc\x8c\xe4\xb8\x96\xe7\x95\x8c\xef\xbc\x81")
\end{lstlisting}
It is then tokenised to a sequence of 12 words:
\begin{lstlisting}
'print', ' ', '(', '"', 'Hello', ' ', 'world', '!', ' ', '\xe4\xbd\xa0\xe5\xa5\xbd\xef\xbc\x8c\xe4\xb8\x96\xe7\x95\x8c\xef\xbc\x81', '"', ')'
\end{lstlisting}
\subsection{Model Quality Metrics}
For a language $l$, the test results of documents can be grouped into 4 categories, where $\hat{y_i}$ denotes the ground truth class label and $y_i$ the predicted label of document $i$:
\begin{itemize}
\item True Positive (TP): when $\hat{y_i} = l$ and $y_i = l$, \emph{i.e.} document written in $l$ is recognised as the same language.
 \item False Positive (FP): when $\hat{y_i} \neq l$ and $y_i = l$, \emph{i.e.} document not written in language $l$ is incorrectly recognised as $l$.
\item True Negative (TN): when $\hat{y_i} \neq l$ and $y_i \neq l$, \emph{i.e.} document not written in $l$ is rejected by $l$.
\item False Negative (FN): when $\hat{y_i} = l$ and $y_i \neq l$, \emph{i.e.} document written in $l$ is incorrectly rejected by $l$.
\end{itemize}
In the context of classification, the quality of methods is measured by Precision, Recall and $F_1$ score.
Recall is also called True Positive Rate (TPR). It is the fraction of correctly classified samples over all samples that should be predicted as $l$:
\[\text{recall} = \frac{\text{\#TP}}{\text{\#TP}+\text{\#FN}}\]
Precision is also called Positive Predictive Value (PPV). It is the fraction of correctly classified samples over all samples predicted as $l$:
\[\text{precision} = \frac{\text{\#TP}}{\text{\#TP}+\text{\#FP}}\]
The harmonic mean of precision and recall is called $F_1$ score, introduced for balancing two metrics:
\[
F_1 = \left(\frac{\text{precision}^{-1} + \text{recall}^{-1}}{2}\right)^{-1} = 2\cdot\frac{\text{precision}\cdot\text{recall}}{\text{precision}+\text{recall}}
\]
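As a worked example with made-up counts (chosen for illustration, not measured on our test set), suppose that a language obtains $\text{\#TP} = 8$, $\text{\#FP} = 2$ and $\text{\#FN} = 8$. Then
\[
\text{precision} = \frac{8}{8+2} = 0.8, \qquad
\text{recall} = \frac{8}{8+8} = 0.5, \qquad
F_1 = 2\cdot\frac{0.8 \cdot 0.5}{0.8 + 0.5} = \frac{0.8}{1.3} \approx 0.615 .
\]
The arithmetic mean of the two metrics would be $0.65$; the harmonic mean is lower because it is pulled towards the weaker of the two.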
In the following subsections, we use $F_1$ to measure the model quality on each class.
Global model quality is evaluated by accuracy score:
\[
\text{accuracy}(y,\hat{y}) = \frac{1}{n}\sum_{i=0}^{n-1}1(y_i = \hat{y}_i)
\]
where $y$ are the predicted labels, $\hat{y}$ are the ground truth labels, $n$ is the number of samples and $1(\cdot)$ is the indicator function. The score is the ratio of the number of samples whose predicted label equals its ground truth to the total number of samples.
\subsection{Experimental Results}
\subsubsection{Quality of Models}
The evaluation of the quality of models utilises the entire list of 374 languages.
\paragraph{Overall Quality}
-Table~\ref{tab:total-comp} shows that baseline method reaches only 46.14\% of accuracy. Byte-level ConvNet marks the best accuracy at 87.26\% which is much higher than word-level ConvNet. Both MNB and $n$-gram model reach acceptable results respectively at 85.10\% and 83.39\%.
+Table~\ref{tab:total-comp} shows that the baseline method reaches only ??.??\% overall accuracy. The byte-level ConvNet achieves the best accuracy at ??.??\%, which is much higher than that of the word-level ConvNet. MNB and the $n$-gram model reach acceptable results, at ??.??\% and ??.??\% respectively.
\begin{table}[t]
\centering
\begin{tabular}{|c|c|}
\hline
& Accuracy / \% \\ \hline
- Baseline & 46.14 \\
- MNB & 85.10 \\
- $n$-gram model & 83.39 \\
- Word-level ConvNet & 76.77 \\
- Byte-level ConvNet & 87.26 \\ \hline
+ Baseline & ??.?? \\
+ MNB & ??.?? \\
+ $n$-gram model & ??.?? \\
+ Word-level ConvNet & ??.?? \\
+ Byte-level ConvNet & ??.?? \\ \hline
\end{tabular}
\caption{\label{tab:total-comp} Comparison of accuracy between evaluation methods.}
\end{table}
-\paragraph{Inequality Between Classes} Although the overall score of Byte-level ConvNet reaches 87.26\%, $F_1$ score of several classes is much lower than the average. For instance, $F_1$ of NetLogo reaches 99.9\%, meanwhile C++ achieves only 47.8\%. Figure~\ref{fig:ineq} illustrates huge gap between best and worst results.
+\paragraph{Inequality Between Classes} Although the overall accuracy of the byte-level ConvNet reaches ??.??\%, the $F_1$ score of several classes is much lower than the average. For instance, the $F_1$ of NetLogo reaches ??.??\%, while C++ achieves only ??.??\%. Figure~\ref{fig:ineq} illustrates the huge gap between the best and worst results.
\begin{figure}[t!]
\centering
- \subfloat[][25 language with highest $F_1$]{
- \includegraphics[height=0.4\textwidth]{./comparison_cnn_f1_above}
- }
- \subfloat[][25 language with least $F_1$]{
- \includegraphics[height=0.4\textwidth]{./comparison_cnn_f1_below}
- }
+% \subfloat[][25 language with highest $F_1$]{
+% \includegraphics[height=0.4\textwidth]{./comparison_cnn_f1_above}
+% }
+% \subfloat[][25 language with least $F_1$]{
+% \includegraphics[height=0.4\textwidth]{./comparison_cnn_f1_below}
+% }
 \caption{\label{fig:ineq} Inequality between the best-performing and the worst-performing classes.}
\end{figure}
\paragraph{Interclass Confusion}
Some languages are especially difficult for these methods to distinguish from each other. The confusion matrices of each method are visualised in our repository and support several intuitive observations.
There is significant confusion between similar languages, \emph{e.g.} C and C++; Objective-C, Objective-C++ and Objective-J; Befunge and HyPhy; Java and Processing; NetLinx, PAWN and Ruby; JavaScript and Cycript, \emph{etc}.
\subsubsection{Benchmark and Model Sizes}
Table~\ref{tab:ben-train} shows that the first three basic NLP methods can be trained rapidly on CPU even when a large number of classes is considered. ConvNet methods demand more computing power in the training stage. At test time, however, ConvNets classify a document more than 10 times faster than the $n$-gram based approaches.
\begin{table}[t]
\centering
\begin{tabular}{|c|c|c|c|}
\hline
& \multirow{2}{*}{Training Time} & Test Time & \multirow{2}{*}{Model Size}\\
& & (per file)&\\
\hline
Baseline & 1.8 h & 0.12 s & 3.8 MiB \\
MNB & 0.7 h & 2 s & 323.0 MiB \\
$n$-gram model & 0.8 h & 1.2 s & 663.1 MiB \\
\multirow{2}{*}{Word-level ConvNet} & 40.6 h & \multirow{2}{*}{0.01 s} & \multirow{2}{*}{313.3 MiB}\\
& (18.2 h*) & & \\
\multirow{2}{*}{Byte-level ConvNet} & 20.8 h & \multirow{2}{*}{0.01 s} & \multirow{2}{*}{32.8 MiB} \\
& (1.6 h*) & & \\
\hline
\end{tabular}
\footnotesize{*: Training time on distant VM using GPU.}
\caption{\label{tab:ben-train} Comparison of training time, test time and model size, benchmarked on the same computer.}
\end{table}
\subsubsection{Filename Extension Is Important}
We know empirically that the filename extension is a critical feature for classification, and we wish to quantify how important it is. Since ConvNets are good at highlighting the features that best distinguish the inputs, we measure the performance of the byte-level ConvNet when the extension of the file is added to the input.
For convenience, we test on only 20 languages of the list. Table~\ref{tab:ext} shows that adding the extension to the input raises the detection accuracy from 93.70\% to 97.37\%.
\begin{table}[t]
\centering
\begin{tabular}{|c|c|}
\hline
& Accuracy / \%\\
\hline
Without Extension & 93.70 \\
With Extension & \textbf{97.37} \\
\hline
\end{tabular}
\caption{\label{tab:ext} Comparison of accuracy with extension and accuracy without extension with Byte-level ConvNet Classification on 20 classes.}
\end{table}
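Read as error rates, the figures of Table~\ref{tab:ext} can be restated with the following worked arithmetic (a reading aid derived only from the two accuracies in the table; no additional measurement is involved):
\begin{align*}
\text{accuracy gain} &= 97.37 - 93.70 = 3.67 \text{ percentage points}, \\
\text{error rate without extension} &= 100 - 93.70 = 6.30\%, \\
\text{error rate with extension} &= 100 - 97.37 = 2.63\%, \\
\text{relative error reduction} &= \frac{6.30 - 2.63}{6.30} \approx 58\%.
\end{align*}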
-\subsubsection{Word or Byte (Ongoing)}
+\subsubsection{Word or Byte}
-Our choice of applying tested methods at byte-level is comparative to the word-level applications. Table~\ref{tab:w-b} indicates that methods perform comparably better for MNB and ConvNet, $n$-gram model drops slightly after switched to byte-level.
+Our choice of applying the tested methods at byte level is competitive with their word-level application. We test each family of methods at these two levels on the \texttt{mini} training and test sets. Table~\ref{tab:w-b} indicates that MNB and ConvNet perform better at byte level, while the $n$-gram model drops slightly after switching to byte level.
\begin{table}[h]
\centering
\begin{tabular}{|c|c|c|}
\hline
&\multicolumn{2}{c|}{Accuracy / \%} \\
\cline{2-3}
& Word & Byte \\
\hline
MNB & 79.95 & 87.81 \\
$n$-gram model & 92.46 & 91.40 \\
ConvNet & 86.71 & 93.70 \\
\hline
\end{tabular}
-\caption{\label{tab:w-b}}
+\caption{\label{tab:w-b} Comparison of accuracy between word-level and byte-level application of tested methods on \texttt{mini} test set.}
\end{table}
-\section{Application in \emph{Software Heritage} (Ongoing)}
+\section{Application in \emph{Software Heritage}}
We apply the byte-level ConvNet, the best-performing method, to a subset of the \emph{Software Heritage} archive containing more than 17 million files (around 0.1\% of the archive). Since we are not able to check all results one by one, several folders are selected for evaluation.
\subsection{Manual Verification Results}
Since we have nothing but the content of each file to judge its language in this dataset, the following results are based on the author's own knowledge, with the help of search engines and other auxiliary tools. The results here are therefore indicative only.
-Table~\ref{tab:manual} indicates the test accuracy of more than a thousand files manually checked by author using a graphic interface. By analysing the tested samples, errors could normally categorised into the following cases:
+Table~\ref{tab:manual} reports the test accuracy on more than a thousand files manually checked by the author using a graphical interface. After analysing the tested samples, errors can generally be categorised into the following cases:
\begin{itemize}
	\item Short files. These files contain code snippets so short that they are indistinguishable even for a human.
	\item Non-text files. Documentation usually consists of PDF documents and PNG or JPEG images, which are inevitably misclassified.
- \item ConvNet does not work well for many popular languages. From the results of former section, popular languages, such as C, C++, HTML, are more often wrongly classified.
+	\item ConvNet does not work well for many popular languages. According to the results of the previous section, frequently occurring languages, such as C, C++ and HTML, are more often wrongly classified.
\end{itemize}
\begin{table}[t]
\centering
\begin{tabular}{|c|c|}
\hline
& Accuracy / \% \\
\hline
-Subset 1 & 69.53 \\
-Subset 2 & 66.67 \\
-Subset 3 & 62.35 \\
-Subset 4 & 68.28 \\
-Subset 5 & 58.97 \\
+Subset 1 & 72.53 \\
+Subset 2 & 67.05 \\
+Subset 3 & 59.11 \\
+Subset 4 & 67.54 \\
+Subset 5 & 63.45 \\
\hline
-Overall & 64.98 \\
+Overall & 66.05 \\
\hline
\end{tabular}
\caption{\label{tab:manual} Test results of manual checking on subsets of the archive.}
\end{table}
-\subsection{Recourse (Ongoing)}
+\paragraph{Recourse}
Libmagic is an efficient library for differentiating plain-text files from other file formats. It is also reliable at recognising popular languages. We therefore decide to drop the particular classes that are both often misclassified by the ConvNet and covered by Libmagic.
-\section{Challenges of Large-scale Deployment (Ongoing)}
+\section{Challenges of Large-scale Deployment}
-\subsection{Imbalance Between Classes}
+\subsection{Imbalance Between Languages}
Imbalance between classes in the dataset can affect the performance of the models in several ways. For approaches essentially based on statistics, such as $n$-gram frequency and the $n$-gram model, a small training set may not provide enough features. For ConvNet approaches, in addition to the former reason, the network tends to ignore smaller classes in order to reduce errors.
-Despite of our efforts on balancing the number of repositories for each class, a significant imbalance is eventually observed between language classes. We know from Figure~\ref{fig:distribution} that the first half of dataset consists of 13 languages, 310 other languages share another half. Nearly a half of languages possess less than 5,000 files, and two third of these own less than 1,000 files.
+Despite our efforts to balance the number of repositories for each class, a significant imbalance is eventually observed between language classes. Figure~\ref{fig:distribution} shows that 13 languages account for the first half of the dataset, while 310 other languages share the other half. Nearly half of the languages have fewer than 5,000 files, and two thirds of these have fewer than 1,000 files.
In future work, we intend to fetch a more balanced dataset for each language and to enrich weaker classes during the real deployment on the archive.
-\subsection{Discovering New Languages}
+\subsection{Recognising New Languages}
The real challenge comes from changes over time. In order to recognise as many languages as possible, our language list should be able to grow over time. Unfortunately, the existing well-performing methods fix a list of languages \emph{a priori} and focus on distinguishing between them.
On the one hand, despite our efforts to fetch as many languages as possible, it is already impossible to list all existing languages. On the other hand, we do not know how many new languages will appear in the archive.
Therefore, in this subsection, we report several attempts at discovering new classes and then discuss the extensibility of the models.
-\subsubsection{Detecting New Languages}
+\subsubsection{Discovering New Languages}
Unsupervised learning is the machine learning task of finding a model that reveals the inherent structure of unlabelled data. Clustering is one of its branches and aims at finding potential new self-forming classes in feature space. Since new languages are still unknown to us, we focus here on hierarchical clustering, which does not require the number of new classes to be fixed \emph{a priori}.
\paragraph{Agglomerative Hierarchical Clustering}
Agglomerative Hierarchical Clustering (AHC) is the most commonly used hierarchical clustering approach. It is a bottom-up approach.
We call a sample without label an \emph{observation}. Given $n$ observations $\{o_1,o_2,\dots,o_n\}$, a distance is calculated with a \emph{pairwise metric} for each pair of observations, resulting in $O(n^2)$ distances. Initially, every single observation is a cluster. At each step, the two closest clusters according to a \emph{linkage criterion} are merged into a single cluster. The algorithm terminates when only one cluster, containing all $n$ observations, remains.
The clustering is first tested on the 20 most popular languages. Unfortunately, it does not work as we expected. By varying the pairwise metric and the linkage criterion, we obtained a slightly better-performing combination: Euclidean distance and average linkage.
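For reference, this combination is commonly defined as follows. The notation is introduced here for illustration only and is an assumption, not taken from our implementation: $x_i$ denotes the feature vector of observation $o_i$, and $A$ and $B$ denote two clusters.
\begin{align*}
d(o_i, o_j) &= \lVert x_i - x_j \rVert_2, \\
D_{\mathrm{avg}}(A, B) &= \frac{1}{|A|\,|B|} \sum_{o_i \in A} \sum_{o_j \in B} d(o_i, o_j).
\end{align*}
Under average linkage, each step merges the pair of clusters with the smallest $D_{\mathrm{avg}}$.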
However, Figure~\ref{fig:???} shows that only a few languages, such as Objective-C, are able to form a large, visible and pure cluster of a single language. Most of them are evenly mixed inside a cluster. We will continue to explore methods other than AHC for this task in the future.
-\subsubsection{Extensibility of Existing Models (Ongoing)}
+\subsubsection{Extensibility of Existing Models}
Once discovered, new classes need to be integrated into the existing model. Since the baseline method, the $n$-gram model and MNB rely on a profile storing statistics for each language, it suffices to train on the incoming supplementary training sets and simply add the resulting profiles to the model. ConvNet approaches, by contrast, have to be retrained with a new network. However, no matter how we integrate these classes into the original models, the quality of the models drops as more classes are added.
\paragraph{Impact of Retraining with More Classes}
The objective of \emph{Software Heritage} is to recognise as many languages as possible. It is therefore inevitable that new languages will be integrated into an older classifier. We test 3 series of training and test sets in order to measure the impact of the number of classes on global results and the deterioration of $F_1$ for commonly occurring languages.
\begin{table}[h]
\centering
\begin{tabular}{|c|c|c|c|}
\hline
&\multicolumn{3}{c|}{Accuracy / \%} \\
\cline{2-4}
& \texttt{mini} & \texttt{less} & \texttt{total} \\
& \footnotesize{(20 languages)} & \footnotesize{(109 languages)} & \footnotesize{(323 languages)}\\
\hline
Baseline & 63.03 & 50.09 & 46.14 \\
MNB & 87.81 & 85.34 & 85.10 \\
$n$-gram model & 91.40 & 86.36 & 83.39 \\
Word-level ConvNet & 86.71 & 76.88 & 76.77\\
Byte-level ConvNet & 93.70 & 90.87 & 89.77 \\
\hline
\end{tabular}
\caption{\label{tab:size} Comparison of accuracy score for each method on 3 series of training and test sets.}
\end{table}
Table~\ref{tab:size} compares the global accuracy scores of each series and each approach. As the number of classes grows, the accuracy drops for all methods. From 20 languages to 323 languages, the baseline method loses 16.89 percentage points, while MNB loses only 2.71.
Figure~\ref{fig:size} shows that the recognition quality of earlier integrated languages drops in most cases, especially for languages from which later-introduced languages are derived.
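The drops from \texttt{mini} to \texttt{total} follow directly from the first and last columns of Table~\ref{tab:size}; they are listed here in percentage points as a worked reading of the table, not as additional results:
\begin{align*}
\text{Baseline:} &\quad 63.03 - 46.14 = 16.89, \\
\text{MNB:} &\quad 87.81 - 85.10 = 2.71, \\
\text{$n$-gram model:} &\quad 91.40 - 83.39 = 8.01, \\
\text{Word-level ConvNet:} &\quad 86.71 - 76.77 = 9.94, \\
\text{Byte-level ConvNet:} &\quad 93.70 - 89.77 = 3.93.
\end{align*}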
\begin{figure}[t!]
\centering
- \subfloat[Baseline]{
- \includegraphics[height=8cm]{./comparison_ngram_dist_size.pdf}
- }
- \subfloat[MNB]{
- \includegraphics[height=8cm]{./comparison_bayes_size.pdf}
- }
- \subfloat[$n$-gram]{
- \includegraphics[height=8cm]{./comparison_ngram_prob_size.pdf}
- }
- \subfloat[Word ConvNet]{
- \includegraphics[height=8cm]{./comparison_cnn_word_size.pdf}
- }
- \subfloat[Byte ConvNet]{
- \includegraphics[height=8cm]{./comparison_cnn_size.pdf}
- }
+% \subfloat[Baseline]{
+% \includegraphics[height=8cm]{./comparison_ngram_dist_size.pdf}
+% }
+% \subfloat[MNB]{
+% \includegraphics[height=8cm]{./comparison_bayes_size.pdf}
+% }
+% \subfloat[$n$-gram]{
+% \includegraphics[height=8cm]{./comparison_ngram_prob_size.pdf}
+% }
+% \subfloat[Word ConvNet]{
+% \includegraphics[height=8cm]{./comparison_cnn_word_size.pdf}
+% }
+% \subfloat[Byte ConvNet]{
+% \includegraphics[height=8cm]{./comparison_cnn_size.pdf}
+% }
\caption{\label{fig:size} Comparison of $F_1$ score for each method on 3 series of training and test sets. (Blue: \texttt{mini}, Red: \texttt{less}, Cyan: \texttt{total})}
\end{figure}
-\subsubsection{Incremental Learning (Ongoing)}
+\subsubsection{Incremental Learning}
Incremental learning is another track of supervised learning, which is able to take new classes into account when they appear in the training data flow. This online procedure means that previously learned knowledge is conserved, reused and enriched, unlike offline retraining, which completely forgets the former knowledge. Several deep incremental learning models exist today, \emph{e.g.} Gepperth's GeppNet\cite{Gepperth16}, Rebuffi \emph{et al.}'s iCaRL\cite{RebuffiKL16}, Kemker and Kanan's FearNet\cite{Kemker17}, \emph{etc.}
Although online learning is favourable for the use cases of \emph{Software Heritage}, the performance reported in \cite{Kemker17} indicates that the overall accuracy of these incremental learning methods inevitably degrades after adding new classes. In addition, \cite{Kemker17} shows that online learning always underperforms its offline counterpart.
-\subsubsection{Other Solutions (Ongoing)}
-
-
+\section{Conclusion}
-\section{Conclusion (Ongoing)}
-
-In the frame of TRE, we investigated existing NPL methods of text categorisation for applying to source code in Software Heritage. We tested on an originally created dataset with 374 classes.
+In the frame of TRE, we investigated existing NLP methods of text categorisation and applied them to source code in \emph{Software Heritage}. A dataset covering 374 language classes was created to evaluate the tested machine learning methods. We compared several mature NLP methods at a large scale and proposed the best-performing one, the byte-level ConvNet, for a possible large-scale deployment in the archive. Although the performance is not yet convincing enough for an official deployment, we traced a potential roadmap for the future extension of the models.
\clearpage
\begin{appendices}
\section{Language List}
\begin{table*}[h!]
\centering
\tiny
\begin{tabularx}{\textwidth}{|X|X|X|X|X|X|}
\hline
-1C Enterprise & ABAP & ActionScript & Ada & Agda & AGS Script \\
-\hline
-Alloy & AMPL & AngelScript & ANTLR & Apex & API Blueprint \\
-\hline
-APL & AppleScript & Arc & ASP & AspectJ & Assembly \\
-\hline
-ATS & Augeas & AutoHotkey & AutoIt & Awk & Ballerina \\
-\hline
-Batchfile & Befunge & BitBake & BlitzBasic & BlitzMax & Bluespec \\
-\hline
-Boo & Brainfuck & Brightscript & Bro & C & C\# \\
-\hline
-C++ & Cap'n Proto & CartoCSS & Ceylon & Chapel & Charity \\
-\hline
-ChucK & Cirru & Clarion & Clean & Click & CLIPS \\
-\hline
-Clojure & CMake & COBOL & CoffeeScript & ColdFusion & Common Lisp \\
-\hline
-Common Workflow Language & Component Pascal & Cool & Coq & Crystal & Csound \\
-\hline
-Csound Document & Csound Score & CSS & Cuda & CWeb & Cycript \\
-\hline
-D & Dart & DataWeave & DIGITAL Command Language & DM & Dogescript \\
-\hline
-DTrace & Dylan & E & eC & ECL & Eiffel \\
-\hline
-Elixir & Elm & Emacs Lisp & EmberScript & EQ & Erlang \\
-\hline
-F\# & Factor & Fancy & Fantom & Filebench WML & FLUX \\
-\hline
-Forth & Fortran & FreeMarker & Frege & Game Maker Language & GAMS \\
-\hline
-GAP & GDB & GDScript & Genie & Genshi & Gherkin \\
-\hline
-GLSL & Glyph & Gnuplot & Go & Golo & Gosu \\
-\hline
-Grace & Grammatical Framework & Groovy & Hack & Harbour & Haskell \\
-\hline
-Haxe & HCL & HLSL & HTML & Hy & HyPhy \\
-\hline
-IDL & Idris & IGOR Pro & Inform 7 & Inno Setup & Io \\
-\hline
-Ioke & Isabelle & J & Jasmin & Java & JavaScript \\
-\hline
-Jolie & JSONiq & Julia & Jupyter Notebook & Kit & Kotlin \\
-\hline
-KRL & LabVIEW & Lasso & Lean & Lex & LFE \\
-\hline
-LilyPond & Limbo & Liquid & LiveScript & LLVM & Logos \\
-\hline
-Logtalk & LOLCODE & LookML & LoomScript & LSL & Lua \\
-\hline
-M & M4 & Makefile & Mako & Markdown & Mask \\
-\hline
-Mathematica & Matlab & Max & MAXScript & Mercury & Meson \\
-\hline
-Metal & Mirah & Modelica & Modula-2 & Module Management System & Monkey \\
-\hline
-Moocode & MoonScript & MQL4 & MQL5 & MTML & mupad \\
-\hline
-NCL & Nearley & Nemerle & nesC & NetLinx & NetLinx+ERB \\
-\hline
-NetLogo & NewLisp & Nextflow & Nim & Nit & Nix \\
-\hline
-NSIS & Nu & Objective-C & Objective-C++ & Objective-J & OCaml \\
-\hline
-Omgrofl & ooc & Opa & Opal & OpenEdge ABL & OpenSCAD \\
-\hline
-Ox & Oxygene & Oz & P4 & Pan & Papyrus \\
-\hline
-Parrot & Pascal & PAWN & Pep8 & Perl & Perl 6 \\
-\hline
-PHP & PicoLisp & PigLatin & Pike & PLpgSQL & PLSQL \\
-\hline
-PogoScript & Pony & PostScript & POV-Ray SDL & PowerBuilder & PowerShell \\
-\hline
-Processing & Prolog & Propeller Spin & Puppet & PureBasic & PureScript \\
-\hline
-Python & QMake & QML & R & Racket & Ragel \\
-\hline
-RAML & Rascal & REALbasic & Rebol & Red & Redcode \\
-\hline
-Ren'Py & RenderScript & reStructuredText & REXX & Ring & RMarkdown \\
-\hline
-RobotFramework & Roff & Rouge & RPC & Ruby & Rust \\
-\hline
-SaltStack & SAS & Scala & Scheme & Scilab & Self \\
-\hline
-ShaderLab & Shell & ShellSession & Shen & Slash & Smali \\
-\hline
-Smalltalk & Smarty & SMT & Solidity & SourcePawn & SQF \\
-\hline
-SQLPL & Squirrel & SRecode Template & Stan & Standard ML & Stata \\
-\hline
-SuperCollider & Swift & SystemVerilog & Tcl & Tea & Terra \\
-\hline
-TeX & Thrift & TI Program & TLA & Turing & TXL \\
-\hline
-TypeScript & Uno & UnrealScript & UrWeb & Vala & VCL \\
-\hline
-Verilog & VHDL & Vim script & Visual Basic & Volt & Vue \\
-\hline
-wdl & WebAssembly & WebIDL & wisp & X10 & xBase \\
-\hline
-XC & XML & Xojo & XProc & XQuery & XS \\
-\hline
- XSLT & Xtend & Yacc & YAML & Zephir & Zimpl \\
-\hline
+1C Enterprise & ABAP & ActionScript & Ada & Adobe Font Metrics & Agda\\ \hline
+AGS Script & Alloy & AMPL & AngelScript & Ant Build System & ANTLR\\ \hline
+ApacheConf & Apex & API Blueprint & APL & AppleScript & Arc\\ \hline
+AsciiDoc & ASP & AspectJ & Assembly & ATS & Augeas\\ \hline
+AutoHotkey & AutoIt & Awk & Ballerina & Batchfile & BitBake\\ \hline
+BlitzBasic & BlitzMax & Bluespec & Boo & Brainfuck & Brightscript\\ \hline
+Bro & C & C\# & C++ & Cap'n Proto & CartoCSS\\ \hline
+Ceylon & Chapel & Charity & ChucK & Cirru & Clarion\\ \hline
+Clean & Click & CLIPS & Clojure & CMake & COBOL\\ \hline
+CoffeeScript & ColdFusion & COLLADA & Common Lisp & Common Workflow Language & Component Pascal\\ \hline
+CoNLL-U & Cool & Coq & Crystal & Csound & Csound Document\\ \hline
+Csound Score & CSS & CSV & Cuda & CWeb & Cycript\\ \hline
+D & Dart & DataWeave & desktop & Diff & DIGITAL Command Language\\ \hline
+DM & Dockerfile & Dogescript & DTrace & Dylan & E\\ \hline
+Eagle & eC & ECL & edn & Eiffel & Elixir\\ \hline
+Elm & Emacs Lisp & EmberScript & EQ & Erlang & F\#\\ \hline
+Factor & Fancy & Fantom & Filebench WML & FLUX & Forth\\ \hline
+Fortran & FreeMarker & Frege & G-code & Game Maker Language & GAMS\\ \hline
+GAP & GDB & GDScript & Genie & Genshi & Gerber Image\\ \hline
+Gettext Catalog & Gherkin & GLSL & Glyph & Gnuplot & Go\\ \hline
+Golo & Gosu & Grace & Gradle & Grammatical Framework & Graph Modeling Language\\ \hline
+GraphQL & Graphviz (DOT) & Groovy & Hack & Harbour & Haskell\\ \hline
+Haxe & HCL & HLSL & HTML & HXML & Hy\\ \hline
+HyPhy & IDL & Idris & IGOR Pro & Inform 7 & INI\\ \hline
+Inno Setup & Io & Ioke & Isabelle & J & Jasmin\\ \hline
+Java & JavaScript & Jolie & JSON & JSON5 & JSONiq\\ \hline
+Julia & Jupyter Notebook & KiCad Layout & KiCad Legacy Layout & KiCad Schematic & Kit\\ \hline
+Kotlin & KRL & LabVIEW & Lasso & Lean & Lex\\ \hline
+LFE & LilyPond & Limbo & Linker Script & Linux Kernel Module & Liquid\\ \hline
+LiveScript & LLVM & Logos & Logtalk & LOLCODE & LookML\\ \hline
+LoomScript & LSL & Lua & M & M4 & Makefile\\ \hline
+Mako & Markdown & Mask & Mathematica & Matlab & Maven POM\\ \hline
+Max & MAXScript & MediaWiki & Mercury & Meson & Metal\\ \hline
+Mirah & Modelica & Modula-2 & Module Management System & Monkey & Moocode\\ \hline
+MoonScript & MQL4 & MQL5 & MTML & mupad & NCL\\ \hline
+Nemerle & nesC & NetLinx & NetLogo & NewLisp & Nextflow\\ \hline
+Nginx & Nim & Nit & Nix & NSIS & Nu\\ \hline
+Objective-C & Objective-C++ & Objective-J & OCaml & ooc & Opa\\ \hline
+OpenEdge ABL & OpenSCAD & OpenType Feature File & Org & Ox & Oz\\ \hline
+P4 & Pan & Papyrus & Parrot & Pascal & PAWN\\ \hline
+Pep8 & Perl & Perl 6 & PHP & Pickle & PicoLisp\\ \hline
+PigLatin & Pike & PLpgSQL & PLSQL & Pod & PogoScript\\ \hline
+Pony & PostScript & POV-Ray SDL & PowerBuilder & PowerShell & Processing\\ \hline
+Prolog & Propeller Spin & Protocol Buffer & Public Key & Puppet & Pure Data\\ \hline
+PureBasic & PureScript & Python & q & QMake & QML\\ \hline
+R & Racket & Ragel & RAML & Rascal & Raw token data\\ \hline
+RDoc & REALbasic & Rebol & Red & Redcode & Ren'Py\\ \hline
+RenderScript & reStructuredText & REXX & Ring & RMarkdown & RobotFramework\\ \hline
+Roff & Rouge & RPC & RPM Spec & Ruby & Rust\\ \hline
+SaltStack & SAS & Scala & Scheme & Scilab & sed\\ \hline
+Self & ShaderLab & Shell & Shen & Slash & Smali\\ \hline
+Smalltalk & Smarty & SMT & Solidity & SourcePawn & SPARQL\\ \hline
+SQF & SQL & SQLPL & Squirrel & SRecode Template & Stan\\ \hline
+Standard ML & Stata & SubRip Text & SuperCollider & SVG & Swift\\ \hline
+SystemVerilog & Tcl & Tea & Terra & TeX & Text\\ \hline
+Textile & Thrift & TI Program & TLA & TOML & Turing\\ \hline
+Turtle & TXL & TypeScript & Unity3D Asset & Uno & UnrealScript\\ \hline
+UrWeb & Vala & VCL & Verilog & VHDL & Vim script\\ \hline
+Visual Basic & Volt & Vue & Wavefront Material & Wavefront Object & wdl\\ \hline
+Web Ontology Language & WebAssembly & WebIDL & wisp & X10 & xBase\\ \hline
+XC & XML & Xojo & XPages & XProc & XQuery\\ \hline
+XS & XSLT & Xtend & Yacc & YAML & YANG\\ \hline
+Zephir & Zimpl \\ \cline{1-2}
\end{tabularx}
-\caption{\label{tab:lan} Language List of the dataset, 323 languages engaged.}
+\caption{\label{tab:lan} List of the 374 languages in the dataset.}
\end{table*}
\section{File Distribution in Dataset of Each Language}
\begin{figure}[h]
\centering
\includegraphics[width=\textwidth]{circle}
\caption{\label{fig:distribution}File Distribution in Dataset of Each Language}
\end{figure}
\section{Hyperparameters of ConvNets}
\begin{table}[h!]
\centering
\begin{tabular}{|c|c|}
\hline
Hyperparameter & Value \\
\hline
-input size & 2,048 \\
-vocabulary size & 256 \\
-character embedding size & 32 \\
-filter sizes & [3, 5, 7, 9, 10] \\
-nb. of filter matrices & 256 \\
+input size & 400 \\
+vocabulary size & 15000 \\
+character embedding size & 128 \\
+filter sizes & [3, 4, 5] \\
+nb. of filter matrices & 100 \\
+dropout rate & 0.5 \\
activation function & ReLU \\
-nb. of neurons in fully connected level & 1,024 \\
-nb. of classes & 323 \\
+nb. of neurons in fully connected level & 1024 \\
+nb. of classes & 374 \\
\hline
\end{tabular}
-\caption{\label{tab:hyp-byte} Details of hyperparameter configuration of byte-level ConvNet architecture, referred from Chaitanya Joshi's adaptation.}
+\caption{\label{tab:hyp-word} Hyperparameter configuration of the word-level ConvNet architecture, following \cite{Kim14}.}
\end{table}
\begin{table}[h!]
\centering
\begin{tabular}{|c|c|}
\hline
Hyperparameter & Value \\
\hline
-input size & 400 \\
-vocabulary size & 15000 \\
-character embedding size & 128 \\
-filter sizes & [3, 4, 5] \\
-nb. of filter matrices & 100 \\
-dropout rate & 0.5 \\
+input size & 4,096 \\
+vocabulary size & 256 \\
+character embedding size & 32 \\
+filter sizes & [3, 5, 7, 9, 10] \\
+nb. of filter matrices & 256 \\
activation function & ReLU \\
-nb. of neurons in fully connected level & 1024 \\
-nb. of classes & 323 \\
+nb. of neurons in fully connected level & 1,024 \\
+nb. of classes & 374 \\
\hline
\end{tabular}
-\caption{\label{tab:hyp-word} Details of hyperparameter configuration of word-level ConvNet architecture, referred from \cite{Kim14}.}
+\caption{\label{tab:hyp-byte} Hyperparameter configuration of the byte-level ConvNet architecture, following Chaitanya Joshi's adaptation.}
\end{table}
\end{appendices}
\bibliography{bib-rapport}
\bibliographystyle{unsrt}
%Report
%
%It should be 15 to 30 pages long and, as far as possible, largely readable by non-specialists. Its outline may be, for example:
%presentation of the research field (the jury is not made up solely of specialists in the field, so bear this in mind!);
%statement and motivation of the topic;
%existing results related to it (state of the art, article commentary, ...);
%your personal results (clearly identified as such).
%The report must be accompanied by a one-page summary understandable by anyone.
\end{document}
\ No newline at end of file
diff --git a/swh/langdetect/checker.py b/swh/langdetect/checker.py
index 9f873c1..663a4c3 100644
--- a/swh/langdetect/checker.py
+++ b/swh/langdetect/checker.py
@@ -1,160 +1,160 @@
 #!/usr/bin/python
 from PyQt5 import QtGui, QtCore
 from pyforms import BaseWidget
 from pyforms.controls import ControlTextArea
 from pyforms.controls import ControlDir
 from pyforms.controls import ControlList
 from pyforms.controls import ControlLabel
 from pyforms.controls import ControlCombo
 from .cnn import CNN
 import pyforms, os, gzip
 from pickle import load, dump

 RED = QtGui.QColor(255,0,0)
 WHITE = QtGui.QColor(255,255,255)
 GREEN = QtGui.QColor(0,255,0)
 BLACK = QtGui.QColor(0,0,0)

 class Check(BaseWidget):
     def __init__(self):
         super(Check, self).__init__('Software Heritage Source Code Language Manual Check Tool')
         self._control_root = ControlDir('Choose the root of database: ')
         self._control = ControlDir('Choose a directory: ')
         self._list = ControlList('Files in the directory')
         self._text = ControlTextArea('Content')
         self._label = ControlLabel('Language: \nValue: ')
         self._label_rest = ControlLabel('')
         self._combo = ControlCombo('Is that correct ?')
         self.formset = ['_control_root', '_control', ('_list', ['_text', ('_label', '_combo')]),'_label_rest']
         self._control_root.changed_event = self.__save_root
         self._control.changed_event = self.__get_files
         self._list.readonly = True
         self._list.item_selection_changed_event=self.__show_text
         self._text.readonly = True
-        self._cnn = CNN(None, 2048, None)
+        self._cnn = CNN(None, 4096, None)
         self._root = None
         self._dict = {}
         self._combo += ('Unknown', None)
         self._combo += ('No', False)
         self._combo += ('Yes', True)
         self._combo.activated_event = self.__checked
         self._curr_row = None
         self._curr_column = None
         self._curr_dir = None
         self._state = 0
         self.before_close_event = self.__store

     def __save_root(self):
         self._root = self._control_root.value
         try:
             with open(os.path.join(self._root, 'results'), 'rb') as f:
                 self._dicts = load(f)
         except Exception:
             self._dicts = {}
         self._state = 1
         self._control.value = self._root

     def __store(self):
         with open(os.path.join(self._root, 'results'), 'wb') as f:
             self._dicts[self._curr_dir] = self._dict
             dump(self._dicts, f)

     def __get_files(self):
         if self._state == 1:
             self._state = 2
             return
         elif self._state == 0:
             self.alert_popup('Please choose root of your database.', title='Error')
             return
         res = []
         if self._curr_dir != None:
             self._dicts[self._curr_dir] = self._dict
         self._curr_dir = self._control.value
         self._dict = self._dicts.get(self._curr_dir, {})
         for root, sub, files in os.walk(self._curr_dir):
             if sub == []:
                 for file in files:
                     if not file.startswith('.'):
                         res.append((os.path.join(root, file),))
         self._list.value = res
         self._update_status()
         self._update_cells_color()

     def __checked(self, index):
         path = self._list.get_value(self._curr_column, self._curr_row)
         self._dict[path] = self._combo.value
         if self._combo.value == 'Unknown':
             del self._dict[path]
         self._update_color(self._combo.value, self._list.get_cell(self._curr_column, self._curr_row))
         self._update_status()

     def _update_status(self):
         correct = len([x for x in self._dict.values() if x == True])
         wrong = len(self._dict.keys()) - correct
         remaining = len(self._list.value) - len(self._dict.keys())
         try:
             accuracy = correct / (correct + wrong) * 100
         except:
             accuracy = 0
         self._label_rest.value = 'Correct:\t{}\tWrong:\t{}\tRemaining:\t{}\tAccuracy:\t{:.2f}%'.format(correct, wrong, remaining, accuracy)

     def _update_cells_color(self):
         n = self._list.rows_count
         for i in range(0, n):
             cell = self._list.get_cell(0, i)
             value = self._list.get_value(0, i)
             self._update_color(self._dict.get(value, None), cell)

     def _update_color(self, x, cell):
         if x == False:
             cell.setBackground(RED)
         elif x == True:
             cell.setBackground(GREEN)
         else:
             cell.setBackground(WHITE)

     def __show_text(self):
         column = 0
         row = self._list.selected_row_index
         self._curr_row = row
         self._curr_column = column
         if row == None:
             self._text.value = ''
             self._label.value = 'Language: \nValue: '
             return
         path = self._list.get_value(column, row)
         with gzip.open(path, 'rb') as f:
             string = f.read()
         try:
             string = string.decode('utf-8')
         except UnicodeDecodeError:
             pass
         self._text.value = string[:10240]
         res = self._cnn.classify(path)
         if(res[1] >= 0):
             self._label.value = 'Language: {}\nValue: {}'.format(res[0],res[1])
         else:
             self._label.value = 'Language: No Reliable Result\nValue: '
         h_sel = self._dict.get(path, None)
         if h_sel == None:
             self._combo.current_index = 0
         elif h_sel == False:
             self._combo.current_index = 1
         elif h_sel == True:
             self._combo.current_index = 2

 #Execute the application
 if __name__ == "__main__":
     pyforms.start_app(Check)
diff --git a/swh/langdetect/static_data/languages.yml b/swh/langdetect/static_data/languages.yml
deleted file mode 100755
index 83abb6a..0000000
--- a/swh/langdetect/static_data/languages.yml
+++ /dev/null
@@ -1,5339 +0,0 @@
-# Defines all Languages known to GitHub.
-#
-# type - Either data, programming, markup, prose, or nil
-# aliases - An Array of additional aliases (implicitly
-# includes name.downcase)
-# ace_mode - A String name of the Ace Mode used for highlighting whenever
-# a file is edited. This must match one of the filenames in http://git.io/3XO_Cg.
-# Use "text" if a mode does not exist.
-# codemirror_mode - A String name of the CodeMirror Mode used for highlighting whenever a file is edited.
-# This must match a mode from https://git.io/vi9Fx
-# wrap - Boolean wrap to enable line wrapping (default: false)
-# extensions - An Array of associated extensions (the first one is
-# considered the primary extension, the others should be
-# listed alphabetically)
-# interpreters - An Array of associated interpreters
-# searchable - Boolean flag to enable searching (defaults to true)
-# language_id - Integer used as a language-name-independent indexed field so that we can rename
-# languages in Linguist without reindexing all the code on GitHub. Must not be
-# changed for existing languages without the explicit permission of GitHub staff.
-# color - CSS hex color to represent the language. Only used if type is "programming" or "prose".
-# tm_scope - The TextMate scope that represents this programming
-# language. This should match one of the scopes listed in
-# the grammars.yml file. Use "none" if there is no grammar
-# for this language.
-# group - Name of the parent language. Languages in a group are counted
-# in the statistics as the parent language.
-#
-# Any additions or modifications (even trivial) should have corresponding
-# test changes in `test/test_blob.rb`.
-#
-# Please keep this list alphabetized. Capitalization comes before lowercase.
-
----
-1C Enterprise:
- type: programming
- color: "#814CCC"
- extensions:
- - ".bsl"
- - ".os"
- tm_scope: source.bsl
- ace_mode: text
- language_id: 0
-ABAP:
- type: programming
- color: "#E8274B"
- extensions:
- - ".abap"
- ace_mode: abap
- language_id: 1
-ABNF:
- type: data
- ace_mode: text
- extensions:
- - ".abnf"
- tm_scope: source.abnf
- language_id: 429
-AGS Script:
- type: programming
- color: "#B9D9FF"
- aliases:
- - ags
- extensions:
- - ".asc"
- - ".ash"
- tm_scope: source.c++
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-c++src
- language_id: 2
-AMPL:
- type: programming
- color: "#E6EFBB"
- extensions:
- - ".ampl"
- - ".mod"
- tm_scope: source.ampl
- ace_mode: text
- language_id: 3
-ANTLR:
- type: programming
- color: "#9DC3FF"
- extensions:
- - ".g4"
- ace_mode: text
- language_id: 4
-API Blueprint:
- type: markup
- color: "#2ACCA8"
- ace_mode: markdown
- extensions:
- - ".apib"
- tm_scope: text.html.markdown.source.gfm.apib
- language_id: 5
-APL:
- type: programming
- color: "#5A8164"
- extensions:
- - ".apl"
- - ".dyalog"
- interpreters:
- - apl
- - aplx
- - dyalog
- tm_scope: source.apl
- ace_mode: text
- codemirror_mode: apl
- codemirror_mime_type: text/apl
- language_id: 6
-ASN.1:
- type: data
- extensions:
- - ".asn"
- - ".asn1"
- tm_scope: source.asn
- ace_mode: text
- codemirror_mode: asn.1
- codemirror_mime_type: text/x-ttcn-asn
- language_id: 7
-ASP:
- type: programming
- color: "#6a40fd"
- tm_scope: text.html.asp
- aliases:
- - aspx
- - aspx-vb
- extensions:
- - ".asp"
- - ".asax"
- - ".ascx"
- - ".ashx"
- - ".asmx"
- - ".aspx"
- - ".axd"
- ace_mode: text
- codemirror_mode: htmlembedded
- codemirror_mime_type: application/x-aspx
- language_id: 8
-ATS:
- type: programming
- color: "#1ac620"
- aliases:
- - ats2
- extensions:
- - ".dats"
- - ".hats"
- - ".sats"
- tm_scope: source.ats
- ace_mode: ocaml
- language_id: 9
-ActionScript:
- type: programming
- tm_scope: source.actionscript.3
- color: "#882B0F"
- aliases:
- - actionscript 3
- - actionscript3
- - as3
- extensions:
- - ".as"
- ace_mode: actionscript
- language_id: 10
-Ada:
- type: programming
- color: "#02f88c"
- extensions:
- - ".adb"
- - ".ada"
- - ".ads"
- aliases:
- - ada95
- - ada2005
- ace_mode: ada
- language_id: 11
-Adobe Font Metrics:
- type: data
- tm_scope: source.afm
- extensions:
- - ".afm"
- aliases:
- - acfm
- - adobe composite font metrics
- - adobe multiple font metrics
- - amfm
- ace_mode: text
- language_id: 147198098
-Agda:
- type: programming
- color: "#315665"
- extensions:
- - ".agda"
- ace_mode: text
- language_id: 12
-Alloy:
- type: programming
- color: "#64C800"
- extensions:
- - ".als"
- ace_mode: text
- language_id: 13
-Alpine Abuild:
- type: programming
- group: Shell
- aliases:
- - abuild
- - apkbuild
- filenames:
- - APKBUILD
- tm_scope: source.shell
- ace_mode: sh
- codemirror_mode: shell
- codemirror_mime_type: text/x-sh
- language_id: 14
-AngelScript:
- type: programming
- color: "#C7D7DC"
- extensions:
- - ".as"
- - ".angelscript"
- tm_scope: source.angelscript
- ace_mode: text
- codemirror_mode: clike
- codemirror_mime_type: text/x-c++src
- language_id: 389477596
-Ant Build System:
- type: data
- tm_scope: text.xml.ant
- filenames:
- - ant.xml
- - build.xml
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: application/xml
- language_id: 15
-ApacheConf:
- type: data
- aliases:
- - aconf
- - apache
- extensions:
- - ".apacheconf"
- - ".vhost"
- tm_scope: source.apache-config
- ace_mode: apache_conf
- language_id: 16
-Apex:
- type: programming
- extensions:
- - ".cls"
- tm_scope: source.java
- ace_mode: java
- codemirror_mode: clike
- codemirror_mime_type: text/x-java
- language_id: 17
-Apollo Guidance Computer:
- type: programming
- group: Assembly
- extensions:
- - ".agc"
- tm_scope: source.agc
- ace_mode: assembly_x86
- language_id: 18
-AppleScript:
- type: programming
- aliases:
- - osascript
- extensions:
- - ".applescript"
- - ".scpt"
- interpreters:
- - osascript
- ace_mode: applescript
- color: "#101F1F"
- language_id: 19
-Arc:
- type: programming
- color: "#aa2afe"
- extensions:
- - ".arc"
- tm_scope: none
- ace_mode: text
- language_id: 20
-AsciiDoc:
- type: prose
- ace_mode: asciidoc
- wrap: true
- extensions:
- - ".asciidoc"
- - ".adoc"
- - ".asc"
- tm_scope: text.html.asciidoc
- language_id: 22
-AspectJ:
- type: programming
- color: "#a957b0"
- extensions:
- - ".aj"
- tm_scope: source.aspectj
- ace_mode: text
- language_id: 23
-Assembly:
- type: programming
- color: "#6E4C13"
- aliases:
- - asm
- - nasm
- extensions:
- - ".asm"
- - ".a51"
- - ".inc"
- - ".nasm"
- tm_scope: source.assembly
- ace_mode: assembly_x86
- language_id: 24
-Augeas:
- type: programming
- extensions:
- - ".aug"
- tm_scope: none
- ace_mode: text
- language_id: 25
-AutoHotkey:
- type: programming
- color: "#6594b9"
- aliases:
- - ahk
- extensions:
- - ".ahk"
- - ".ahkl"
- tm_scope: source.ahk
- ace_mode: autohotkey
- language_id: 26
-AutoIt:
- type: programming
- color: "#1C3552"
- aliases:
- - au3
- - AutoIt3
- - AutoItScript
- extensions:
- - ".au3"
- tm_scope: source.autoit
- ace_mode: autohotkey
- language_id: 27
-Awk:
- type: programming
- extensions:
- - ".awk"
- - ".auk"
- - ".gawk"
- - ".mawk"
- - ".nawk"
- interpreters:
- - awk
- - gawk
- - mawk
- - nawk
- ace_mode: text
- language_id: 28
-Ballerina:
- type: programming
- extensions:
- - ".bal"
- tm_scope: source.ballerina
- ace_mode: text
- color: "#FF5000"
- language_id: 720859680
-Batchfile:
- type: programming
- aliases:
- - bat
- - batch
- - dosbatch
- - winbatch
- extensions:
- - ".bat"
- - ".cmd"
- tm_scope: source.batchfile
- ace_mode: batchfile
- color: "#C1F12E"
- language_id: 29
-Befunge:
- type: programming
- extensions:
- - ".befunge"
- ace_mode: text
- language_id: 30
-Bison:
- type: programming
- group: Yacc
- tm_scope: source.bison
- extensions:
- - ".bison"
- ace_mode: text
- language_id: 31
-BitBake:
- type: programming
- tm_scope: none
- extensions:
- - ".bb"
- ace_mode: text
- language_id: 32
-Blade:
- type: markup
- group: HTML
- extensions:
- - ".blade"
- - ".blade.php"
- tm_scope: text.html.php.blade
- ace_mode: text
- language_id: 33
-BlitzBasic:
- type: programming
- aliases:
- - b3d
- - blitz3d
- - blitzplus
- - bplus
- extensions:
- - ".bb"
- - ".decls"
- tm_scope: source.blitzmax
- ace_mode: text
- language_id: 34
-BlitzMax:
- type: programming
- color: "#cd6400"
- extensions:
- - ".bmx"
- aliases:
- - bmax
- ace_mode: text
- language_id: 35
-Bluespec:
- type: programming
- extensions:
- - ".bsv"
- tm_scope: source.bsv
- ace_mode: verilog
- language_id: 36
-Boo:
- type: programming
- color: "#d4bec1"
- extensions:
- - ".boo"
- ace_mode: text
- tm_scope: source.boo
- language_id: 37
-Brainfuck:
- type: programming
- color: "#2F2530"
- extensions:
- - ".b"
- - ".bf"
- tm_scope: source.bf
- ace_mode: text
- codemirror_mode: brainfuck
- codemirror_mime_type: text/x-brainfuck
- language_id: 38
-Brightscript:
- type: programming
- extensions:
- - ".brs"
- tm_scope: source.brightscript
- ace_mode: text
- language_id: 39
-Bro:
- type: programming
- extensions:
- - ".bro"
- ace_mode: text
- language_id: 40
-C:
- type: programming
- color: "#555555"
- extensions:
- - ".c"
- - ".cats"
- - ".h"
- - ".idc"
- interpreters:
- - tcc
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csrc
- language_id: 41
-C#:
- type: programming
- ace_mode: csharp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csharp
- tm_scope: source.cs
- color: "#178600"
- aliases:
- - csharp
- extensions:
- - ".cs"
- - ".cake"
- - ".cshtml"
- - ".csx"
- language_id: 42
-C++:
- type: programming
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-c++src
- color: "#f34b7d"
- aliases:
- - cpp
- extensions:
- - ".cpp"
- - ".c++"
- - ".cc"
- - ".cp"
- - ".cxx"
- - ".h"
- - ".h++"
- - ".hh"
- - ".hpp"
- - ".hxx"
- - ".inc"
- - ".inl"
- - ".ino"
- - ".ipp"
- - ".re"
- - ".tcc"
- - ".tpp"
- language_id: 43
-C-ObjDump:
- type: data
- extensions:
- - ".c-objdump"
- tm_scope: objdump.x86asm
- ace_mode: assembly_x86
- language_id: 44
-C2hs Haskell:
- type: programming
- group: Haskell
- aliases:
- - c2hs
- extensions:
- - ".chs"
- tm_scope: source.haskell
- ace_mode: haskell
- codemirror_mode: haskell
- codemirror_mime_type: text/x-haskell
- language_id: 45
-CLIPS:
- type: programming
- extensions:
- - ".clp"
- tm_scope: source.clips
- ace_mode: text
- language_id: 46
-CMake:
- type: programming
- extensions:
- - ".cmake"
- - ".cmake.in"
- filenames:
- - CMakeLists.txt
- ace_mode: text
- codemirror_mode: cmake
- codemirror_mime_type: text/x-cmake
- language_id: 47
-COBOL:
- type: programming
- extensions:
- - ".cob"
- - ".cbl"
- - ".ccp"
- - ".cobol"
- - ".cpy"
- ace_mode: cobol
- codemirror_mode: cobol
- codemirror_mime_type: text/x-cobol
- language_id: 48
-COLLADA:
- type: data
- extensions:
- - ".dae"
- tm_scope: text.xml
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- language_id: 49
-CSON:
- type: data
- group: CoffeeScript
- tm_scope: source.coffee
- ace_mode: coffee
- codemirror_mode: coffeescript
- codemirror_mime_type: text/x-coffeescript
- searchable: false
- extensions:
- - ".cson"
- language_id: 424
-CSS:
- type: markup
- tm_scope: source.css
- ace_mode: css
- codemirror_mode: css
- codemirror_mime_type: text/css
- color: "#563d7c"
- extensions:
- - ".css"
- language_id: 50
-CSV:
- type: data
- ace_mode: text
- tm_scope: none
- extensions:
- - ".csv"
- language_id: 51
-CWeb:
- type: programming
- extensions:
- - ".w"
- tm_scope: none
- ace_mode: text
- language_id: 657332628
-Cap'n Proto:
- type: programming
- tm_scope: source.capnp
- extensions:
- - ".capnp"
- ace_mode: text
- language_id: 52
-CartoCSS:
- type: programming
- aliases:
- - Carto
- extensions:
- - ".mss"
- ace_mode: text
- tm_scope: source.css.mss
- language_id: 53
-Ceylon:
- type: programming
- color: "#dfa535"
- extensions:
- - ".ceylon"
- tm_scope: source.ceylon
- ace_mode: text
- language_id: 54
-Chapel:
- type: programming
- color: "#8dc63f"
- aliases:
- - chpl
- extensions:
- - ".chpl"
- ace_mode: text
- language_id: 55
-Charity:
- type: programming
- extensions:
- - ".ch"
- tm_scope: none
- ace_mode: text
- language_id: 56
-ChucK:
- type: programming
- extensions:
- - ".ck"
- tm_scope: source.java
- ace_mode: java
- codemirror_mode: clike
- codemirror_mime_type: text/x-java
- language_id: 57
-Cirru:
- type: programming
- color: "#ccccff"
- ace_mode: cirru
- extensions:
- - ".cirru"
- language_id: 58
-Clarion:
- type: programming
- color: "#db901e"
- ace_mode: text
- extensions:
- - ".clw"
- tm_scope: source.clarion
- language_id: 59
-Clean:
- type: programming
- color: "#3F85AF"
- extensions:
- - ".icl"
- - ".dcl"
- tm_scope: source.clean
- ace_mode: text
- language_id: 60
-Click:
- type: programming
- color: "#E4E6F3"
- extensions:
- - ".click"
- tm_scope: source.click
- ace_mode: text
- language_id: 61
-Clojure:
- type: programming
- ace_mode: clojure
- codemirror_mode: clojure
- codemirror_mime_type: text/x-clojure
- color: "#db5855"
- extensions:
- - ".clj"
- - ".boot"
- - ".cl2"
- - ".cljc"
- - ".cljs"
- - ".cljs.hl"
- - ".cljscm"
- - ".cljx"
- - ".hic"
- filenames:
- - riemann.config
- language_id: 62
-Closure Templates:
- type: markup
- group: HTML
- ace_mode: soy_template
- codemirror_mode: soy
- codemirror_mime_type: text/x-soy
- alias:
- - soy
- extensions:
- - ".soy"
- tm_scope: text.html.soy
- language_id: 357046146
-CoNLL-U:
- type: data
- extensions:
- - ".conllu"
- - ".conll"
- tm_scope: text.conllu
- ace_mode: text
- aliases:
- - CoNLL
- - CoNLL-X
- language_id: 421026389
-CoffeeScript:
- type: programming
- tm_scope: source.coffee
- ace_mode: coffee
- codemirror_mode: coffeescript
- codemirror_mime_type: text/x-coffeescript
- color: "#244776"
- aliases:
- - coffee
- - coffee-script
- extensions:
- - ".coffee"
- - "._coffee"
- - ".cake"
- - ".cjsx"
- - ".iced"
- filenames:
- - Cakefile
- interpreters:
- - coffee
- language_id: 63
-ColdFusion:
- type: programming
- ace_mode: coldfusion
- color: "#ed2cd6"
- aliases:
- - cfm
- - cfml
- - coldfusion html
- extensions:
- - ".cfm"
- - ".cfml"
- tm_scope: text.html.cfm
- language_id: 64
-ColdFusion CFC:
- type: programming
- group: ColdFusion
- ace_mode: coldfusion
- aliases:
- - cfc
- extensions:
- - ".cfc"
- tm_scope: source.cfscript
- language_id: 65
-Common Lisp:
- type: programming
- tm_scope: source.lisp
- color: "#3fb68b"
- aliases:
- - lisp
- extensions:
- - ".lisp"
- - ".asd"
- - ".cl"
- - ".l"
- - ".lsp"
- - ".ny"
- - ".podsl"
- - ".sexp"
- interpreters:
- - lisp
- - sbcl
- - ccl
- - clisp
- - ecl
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- language_id: 66
-Common Workflow Language:
- alias: cwl
- type: programming
- ace_mode: yaml
- codemirror_mode: yaml
- codemirror_mime_type: text/x-yaml
- extensions:
- - ".cwl"
- interpreters:
- - cwl-runner
- color: "#B5314C"
- tm_scope: source.cwl
- language_id: 988547172
-Component Pascal:
- type: programming
- color: "#B0CE4E"
- extensions:
- - ".cp"
- - ".cps"
- tm_scope: source.pascal
- aliases:
- - delphi
- - objectpascal
- ace_mode: pascal
- codemirror_mode: pascal
- codemirror_mime_type: text/x-pascal
- language_id: 67
-Cool:
- type: programming
- extensions:
- - ".cl"
- tm_scope: source.cool
- ace_mode: text
- language_id: 68
-Coq:
- type: programming
- extensions:
- - ".coq"
- - ".v"
- ace_mode: text
- language_id: 69
-Cpp-ObjDump:
- type: data
- extensions:
- - ".cppobjdump"
- - ".c++-objdump"
- - ".c++objdump"
- - ".cpp-objdump"
- - ".cxx-objdump"
- tm_scope: objdump.x86asm
- aliases:
- - c++-objdump
- ace_mode: assembly_x86
- language_id: 70
-Creole:
- type: prose
- wrap: true
- extensions:
- - ".creole"
- tm_scope: text.html.creole
- ace_mode: text
- language_id: 71
-Crystal:
- type: programming
- color: "#776791"
- extensions:
- - ".cr"
- ace_mode: ruby
- codemirror_mode: crystal
- codemirror_mime_type: text/x-crystal
- tm_scope: source.crystal
- interpreters:
- - crystal
- language_id: 72
-Csound:
- type: programming
- aliases:
- - csound-orc
- extensions:
- - ".orc"
- - ".udo"
- tm_scope: source.csound
- ace_mode: csound_orchestra
- language_id: 73
-Csound Document:
- type: programming
- aliases:
- - csound-csd
- extensions:
- - ".csd"
- tm_scope: source.csound-document
- ace_mode: csound_document
- language_id: 74
-Csound Score:
- type: programming
- aliases:
- - csound-sco
- extensions:
- - ".sco"
- tm_scope: source.csound-score
- ace_mode: csound_score
- language_id: 75
-Cuda:
- type: programming
- extensions:
- - ".cu"
- - ".cuh"
- tm_scope: source.cuda-c++
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-c++src
- color: "#3A4E3A"
- language_id: 77
-Cycript:
- type: programming
- extensions:
- - ".cy"
- tm_scope: source.js
- ace_mode: javascript
- codemirror_mode: javascript
- codemirror_mime_type: text/javascript
- language_id: 78
-Cython:
- type: programming
- group: Python
- extensions:
- - ".pyx"
- - ".pxd"
- - ".pxi"
- aliases:
- - pyrex
- ace_mode: text
- codemirror_mode: python
- codemirror_mime_type: text/x-cython
- language_id: 79
-D:
- type: programming
- color: "#ba595e"
- extensions:
- - ".d"
- - ".di"
- ace_mode: d
- codemirror_mode: d
- codemirror_mime_type: text/x-d
- language_id: 80
-D-ObjDump:
- type: data
- extensions:
- - ".d-objdump"
- tm_scope: objdump.x86asm
- ace_mode: assembly_x86
- language_id: 81
-DIGITAL Command Language:
- type: programming
- aliases:
- - dcl
- extensions:
- - ".com"
- tm_scope: none
- ace_mode: text
- language_id: 82
-DM:
- type: programming
- color: "#447265"
- extensions:
- - ".dm"
- aliases:
- - byond
- tm_scope: source.dm
- ace_mode: c_cpp
- language_id: 83
-DNS Zone:
- type: data
- extensions:
- - ".zone"
- - ".arpa"
- tm_scope: text.zone_file
- ace_mode: text
- language_id: 84
-DTrace:
- type: programming
- aliases:
- - dtrace-script
- extensions:
- - ".d"
- interpreters:
- - dtrace
- tm_scope: source.c
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csrc
- language_id: 85
-Darcs Patch:
- type: data
- aliases:
- - dpatch
- extensions:
- - ".darcspatch"
- - ".dpatch"
- tm_scope: none
- ace_mode: text
- language_id: 86
-Dart:
- type: programming
- color: "#00B4AB"
- extensions:
- - ".dart"
- interpreters:
- - dart
- ace_mode: dart
- codemirror_mode: dart
- codemirror_mime_type: application/dart
- language_id: 87
-DataWeave:
- type: programming
- color: "#003a52"
- extensions:
- - ".dwl"
- ace_mode: text
- tm_scope: source.data-weave
- language_id: 974514097
-Diff:
- type: data
- extensions:
- - ".diff"
- - ".patch"
- aliases:
- - udiff
- tm_scope: source.diff
- ace_mode: diff
- codemirror_mode: diff
- codemirror_mime_type: text/x-diff
- language_id: 88
-Dockerfile:
- type: data
- tm_scope: source.dockerfile
- extensions:
- - ".dockerfile"
- filenames:
- - Dockerfile
- ace_mode: dockerfile
- codemirror_mode: dockerfile
- codemirror_mime_type: text/x-dockerfile
- language_id: 89
-Dogescript:
- type: programming
- color: "#cca760"
- extensions:
- - ".djs"
- tm_scope: none
- ace_mode: text
- language_id: 90
-Dylan:
- type: programming
- color: "#6c616e"
- extensions:
- - ".dylan"
- - ".dyl"
- - ".intr"
- - ".lid"
- ace_mode: text
- codemirror_mode: dylan
- codemirror_mime_type: text/x-dylan
- language_id: 91
-E:
- type: programming
- color: "#ccce35"
- extensions:
- - ".E"
- interpreters:
- - rune
- tm_scope: none
- ace_mode: text
- language_id: 92
-EBNF:
- type: data
- extensions:
- - ".ebnf"
- tm_scope: source.ebnf
- ace_mode: text
- codemirror_mode: ebnf
- codemirror_mime_type: text/x-ebnf
- language_id: 430
-ECL:
- type: programming
- color: "#8a1267"
- extensions:
- - ".ecl"
- - ".eclxml"
- tm_scope: none
- ace_mode: text
- codemirror_mode: ecl
- codemirror_mime_type: text/x-ecl
- language_id: 93
-ECLiPSe:
- type: programming
- group: prolog
- extensions:
- - ".ecl"
- tm_scope: source.prolog.eclipse
- ace_mode: prolog
- language_id: 94
-EJS:
- type: markup
- group: HTML
- extensions:
- - ".ejs"
- tm_scope: text.html.js
- ace_mode: ejs
- language_id: 95
-EQ:
- type: programming
- color: "#a78649"
- extensions:
- - ".eq"
- tm_scope: source.cs
- ace_mode: csharp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csharp
- language_id: 96
-Eagle:
- type: data
- extensions:
- - ".sch"
- - ".brd"
- tm_scope: text.xml
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- language_id: 97
-Easybuild:
- type: data
- group: Python
- ace_mode: python
- codemirror_mode: python
- codemirror_mime_type: text/x-python
- tm_scope: source.python
- extensions:
- - ".eb"
- language_id: 342840477
-Ecere Projects:
- type: data
- group: JavaScript
- extensions:
- - ".epj"
- tm_scope: source.json
- ace_mode: json
- codemirror_mode: javascript
- codemirror_mime_type: application/json
- language_id: 98
-Edje Data Collection:
- type: data
- extensions:
- - ".edc"
- tm_scope: source.json
- ace_mode: json
- codemirror_mode: javascript
- codemirror_mime_type: application/json
- language_id: 342840478
-Eiffel:
- type: programming
- color: "#946d57"
- extensions:
- - ".e"
- ace_mode: eiffel
- codemirror_mode: eiffel
- codemirror_mime_type: text/x-eiffel
- language_id: 99
-Elixir:
- type: programming
- color: "#6e4a7e"
- extensions:
- - ".ex"
- - ".exs"
- ace_mode: elixir
- filenames:
- - mix.lock
- interpreters:
- - elixir
- language_id: 100
-Elm:
- type: programming
- color: "#60B5CC"
- extensions:
- - ".elm"
- tm_scope: source.elm
- ace_mode: elm
- codemirror_mode: elm
- codemirror_mime_type: text/x-elm
- language_id: 101
-Emacs Lisp:
- type: programming
- tm_scope: source.emacs.lisp
- color: "#c065db"
- aliases:
- - elisp
- - emacs
- filenames:
- - ".abbrev_defs"
- - ".emacs"
- - ".emacs.desktop"
- - ".gnus"
- - ".spacemacs"
- - ".viper"
- - Cask
- - Project.ede
- - _emacs
- - abbrev_defs
- extensions:
- - ".el"
- - ".emacs"
- - ".emacs.desktop"
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- language_id: 102
-EmberScript:
- type: programming
- color: "#FFF4F3"
- extensions:
- - ".em"
- - ".emberscript"
- tm_scope: source.coffee
- ace_mode: coffee
- codemirror_mode: coffeescript
- codemirror_mime_type: text/x-coffeescript
- language_id: 103
-Erlang:
- type: programming
- color: "#B83998"
- extensions:
- - ".erl"
- - ".app.src"
- - ".es"
- - ".escript"
- - ".hrl"
- - ".xrl"
- - ".yrl"
- filenames:
- - Emakefile
- - rebar.config
- - rebar.config.lock
- - rebar.lock
- ace_mode: erlang
- codemirror_mode: erlang
- codemirror_mime_type: text/x-erlang
- interpreters:
- - escript
- language_id: 104
-F#:
- type: programming
- color: "#b845fc"
- aliases:
- - fsharp
- extensions:
- - ".fs"
- - ".fsi"
- - ".fsx"
- tm_scope: source.fsharp
- ace_mode: text
- codemirror_mode: mllike
- codemirror_mime_type: text/x-fsharp
- language_id: 105
-FLUX:
- type: programming
- color: "#88ccff"
- extensions:
- - ".fx"
- - ".flux"
- tm_scope: none
- ace_mode: text
- language_id: 106
-Factor:
- type: programming
- color: "#636746"
- extensions:
- - ".factor"
- filenames:
- - ".factor-boot-rc"
- - ".factor-rc"
- ace_mode: text
- codemirror_mode: factor
- codemirror_mime_type: text/x-factor
- language_id: 108
-Fancy:
- type: programming
- color: "#7b9db4"
- extensions:
- - ".fy"
- - ".fancypack"
- filenames:
- - Fakefile
- ace_mode: text
- language_id: 109
-Fantom:
- type: programming
- color: "#14253c"
- extensions:
- - ".fan"
- tm_scope: source.fan
- ace_mode: text
- language_id: 110
-Filebench WML:
- type: programming
- extensions:
- - ".f"
- tm_scope: none
- ace_mode: text
- language_id: 111
-Filterscript:
- type: programming
- group: RenderScript
- extensions:
- - ".fs"
- tm_scope: none
- ace_mode: text
- language_id: 112
-Formatted:
- type: data
- extensions:
- - ".for"
- - ".eam.fs"
- tm_scope: none
- ace_mode: text
- language_id: 113
-Forth:
- type: programming
- color: "#341708"
- extensions:
- - ".fth"
- - ".4th"
- - ".f"
- - ".for"
- - ".forth"
- - ".fr"
- - ".frt"
- - ".fs"
- ace_mode: forth
- codemirror_mode: forth
- codemirror_mime_type: text/x-forth
- language_id: 114
-Fortran:
- type: programming
- color: "#4d41b1"
- extensions:
- - ".f90"
- - ".f"
- - ".f03"
- - ".f08"
- - ".f77"
- - ".f95"
- - ".for"
- - ".fpp"
- tm_scope: source.fortran.modern
- ace_mode: text
- codemirror_mode: fortran
- codemirror_mime_type: text/x-fortran
- language_id: 107
-FreeMarker:
- type: programming
- color: "#0050b2"
- aliases:
- - ftl
- extensions:
- - ".ftl"
- tm_scope: text.html.ftl
- ace_mode: ftl
- language_id: 115
-Frege:
- type: programming
- color: "#00cafe"
- extensions:
- - ".fr"
- tm_scope: source.haskell
- ace_mode: haskell
- language_id: 116
-G-code:
- type: data
- extensions:
- - ".g"
- - ".gco"
- - ".gcode"
- tm_scope: source.gcode
- ace_mode: gcode
- language_id: 117
-GAMS:
- type: programming
- extensions:
- - ".gms"
- tm_scope: none
- ace_mode: text
- language_id: 118
-GAP:
- type: programming
- extensions:
- - ".g"
- - ".gap"
- - ".gd"
- - ".gi"
- - ".tst"
- tm_scope: source.gap
- ace_mode: text
- language_id: 119
-GCC Machine Description:
- type: programming
- extensions:
- - ".md"
- tm_scope: source.lisp
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- language_id: 121
-GDB:
- type: programming
- extensions:
- - ".gdb"
- - ".gdbinit"
- tm_scope: source.gdb
- ace_mode: text
- language_id: 122
-GDScript:
- type: programming
- extensions:
- - ".gd"
- tm_scope: source.gdscript
- ace_mode: text
- language_id: 123
-GLSL:
- type: programming
- extensions:
- - ".glsl"
- - ".fp"
- - ".frag"
- - ".frg"
- - ".fs"
- - ".fsh"
- - ".fshader"
- - ".geo"
- - ".geom"
- - ".glslv"
- - ".gshader"
- - ".shader"
- - ".tesc"
- - ".tese"
- - ".vert"
- - ".vrx"
- - ".vsh"
- - ".vshader"
- ace_mode: glsl
- language_id: 124
-GN:
- type: data
- extensions:
- - ".gn"
- - ".gni"
- interpreters:
- - gn
- tm_scope: source.gn
- ace_mode: python
- codemirror_mode: python
- codemirror_mime_type: text/x-python
- language_id: 302957008
-Game Maker Language:
- type: programming
- color: "#8fb200"
- extensions:
- - ".gml"
- tm_scope: source.c++
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-c++src
- language_id: 125
-Genie:
- type: programming
- ace_mode: text
- extensions:
- - ".gs"
- color: "#fb855d"
- tm_scope: none
- language_id: 792408528
-Genshi:
- type: programming
- extensions:
- - ".kid"
- tm_scope: text.xml.genshi
- aliases:
- - xml+genshi
- - xml+kid
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- language_id: 126
-Gentoo Ebuild:
- type: programming
- group: Shell
- extensions:
- - ".ebuild"
- tm_scope: source.shell
- ace_mode: sh
- codemirror_mode: shell
- codemirror_mime_type: text/x-sh
- language_id: 127
-Gentoo Eclass:
- type: programming
- group: Shell
- extensions:
- - ".eclass"
- tm_scope: source.shell
- ace_mode: sh
- codemirror_mode: shell
- codemirror_mime_type: text/x-sh
- language_id: 128
-Gerber Image:
- type: data
- aliases:
- - rs-274x
- extensions:
- - ".gbr"
- - ".gbl"
- - ".gbo"
- - ".gbp"
- - ".gbs"
- - ".gko"
- - ".gpb"
- - ".gpt"
- - ".gtl"
- - ".gto"
- - ".gtp"
- - ".gts"
- interpreters:
- - gerbv
- - gerbview
- tm_scope: source.gerber
- ace_mode: text
- language_id: 404627610
-Gettext Catalog:
- type: prose
- searchable: false
- aliases:
- - pot
- extensions:
- - ".po"
- - ".pot"
- tm_scope: source.po
- ace_mode: text
- language_id: 129
-Gherkin:
- type: programming
- extensions:
- - ".feature"
- tm_scope: text.gherkin.feature
- aliases:
- - cucumber
- ace_mode: text
- color: "#5B2063"
- language_id: 76
-Glyph:
- type: programming
- color: "#e4cc98"
- extensions:
- - ".glf"
- tm_scope: source.tcl
- ace_mode: tcl
- codemirror_mode: tcl
- codemirror_mime_type: text/x-tcl
- language_id: 130
-Gnuplot:
- type: programming
- color: "#f0a9f0"
- extensions:
- - ".gp"
- - ".gnu"
- - ".gnuplot"
- - ".plot"
- - ".plt"
- interpreters:
- - gnuplot
- ace_mode: text
- language_id: 131
-Go:
- type: programming
- color: "#375eab"
- aliases:
- - golang
- extensions:
- - ".go"
- ace_mode: golang
- codemirror_mode: go
- codemirror_mime_type: text/x-go
- language_id: 132
-Golo:
- type: programming
- color: "#88562A"
- extensions:
- - ".golo"
- tm_scope: source.golo
- ace_mode: text
- language_id: 133
-Gosu:
- type: programming
- color: "#82937f"
- extensions:
- - ".gs"
- - ".gst"
- - ".gsx"
- - ".vark"
- tm_scope: source.gosu.2
- ace_mode: text
- language_id: 134
-Grace:
- type: programming
- extensions:
- - ".grace"
- tm_scope: source.grace
- ace_mode: text
- language_id: 135
-Gradle:
- type: data
- extensions:
- - ".gradle"
- tm_scope: source.groovy.gradle
- ace_mode: text
- language_id: 136
-Grammatical Framework:
- type: programming
- aliases:
- - gf
- wrap: false
- extensions:
- - ".gf"
- searchable: true
- color: "#79aa7a"
- tm_scope: source.haskell
- ace_mode: haskell
- codemirror_mode: haskell
- codemirror_mime_type: text/x-haskell
- language_id: 137
-Graph Modeling Language:
- type: data
- extensions:
- - ".gml"
- tm_scope: none
- ace_mode: text
- language_id: 138
-GraphQL:
- type: data
- extensions:
- - ".graphql"
- - ".gql"
- tm_scope: source.graphql
- ace_mode: text
- language_id: 139
-Graphviz (DOT):
- type: data
- tm_scope: source.dot
- extensions:
- - ".dot"
- - ".gv"
- ace_mode: text
- language_id: 140
-Groovy:
- type: programming
- ace_mode: groovy
- codemirror_mode: groovy
- codemirror_mime_type: text/x-groovy
- color: "#e69f56"
- extensions:
- - ".groovy"
- - ".grt"
- - ".gtpl"
- - ".gvy"
- interpreters:
- - groovy
- filenames:
- - Jenkinsfile
- language_id: 142
-Groovy Server Pages:
- type: programming
- group: Groovy
- aliases:
- - gsp
- - java server page
- extensions:
- - ".gsp"
- tm_scope: text.html.jsp
- ace_mode: jsp
- codemirror_mode: htmlembedded
- codemirror_mime_type: application/x-jsp
- language_id: 143
-HCL:
- type: programming
- extensions:
- - ".hcl"
- - ".tf"
- - ".tfvars"
- ace_mode: ruby
- codemirror_mode: ruby
- codemirror_mime_type: text/x-ruby
- tm_scope: source.terraform
- language_id: 144
-HLSL:
- type: programming
- extensions:
- - ".hlsl"
- - ".cginc"
- - ".fx"
- - ".fxh"
- - ".hlsli"
- ace_mode: text
- tm_scope: source.hlsl
- language_id: 145
-HTML:
- type: markup
- tm_scope: text.html.basic
- ace_mode: html
- codemirror_mode: htmlmixed
- codemirror_mime_type: text/html
- color: "#e34c26"
- aliases:
- - xhtml
- extensions:
- - ".html"
- - ".htm"
- - ".html.hl"
- - ".inc"
- - ".st"
- - ".xht"
- - ".xhtml"
- language_id: 146
-HTML+Django:
- type: markup
- tm_scope: text.html.django
- group: HTML
- extensions:
- - ".jinja"
- - ".mustache"
- - ".njk"
- aliases:
- - django
- - html+django/jinja
- - html+jinja
- - htmldjango
- - njk
- - nunjucks
- ace_mode: django
- codemirror_mode: django
- codemirror_mime_type: text/x-django
- language_id: 147
-HTML+ECR:
- type: markup
- tm_scope: text.html.ecr
- group: HTML
- aliases:
- - ecr
- extensions:
- - ".ecr"
- ace_mode: text
- codemirror_mode: htmlmixed
- codemirror_mime_type: text/html
- language_id: 148
-HTML+EEX:
- type: markup
- tm_scope: text.html.elixir
- group: HTML
- aliases:
- - eex
- extensions:
- - ".eex"
- ace_mode: text
- codemirror_mode: htmlmixed
- codemirror_mime_type: text/html
- language_id: 149
-HTML+ERB:
- type: markup
- tm_scope: text.html.erb
- group: HTML
- aliases:
- - erb
- extensions:
- - ".erb"
- - ".erb.deface"
- ace_mode: text
- codemirror_mode: htmlembedded
- codemirror_mime_type: application/x-erb
- language_id: 150
-HTML+PHP:
- type: markup
- tm_scope: text.html.php
- group: HTML
- extensions:
- - ".phtml"
- ace_mode: php
- codemirror_mode: php
- codemirror_mime_type: application/x-httpd-php
- language_id: 151
-HTTP:
- type: data
- extensions:
- - ".http"
- tm_scope: source.httpspec
- ace_mode: text
- codemirror_mode: http
- codemirror_mime_type: message/http
- language_id: 152
-Hack:
- type: programming
- ace_mode: php
- codemirror_mode: php
- codemirror_mime_type: application/x-httpd-php
- extensions:
- - ".hh"
- - ".php"
- tm_scope: text.html.php
- color: "#878787"
- language_id: 153
-Haml:
- group: HTML
- type: markup
- extensions:
- - ".haml"
- - ".haml.deface"
- ace_mode: haml
- codemirror_mode: haml
- codemirror_mime_type: text/x-haml
- language_id: 154
-Handlebars:
- type: markup
- group: HTML
- aliases:
- - hbs
- - htmlbars
- extensions:
- - ".handlebars"
- - ".hbs"
- tm_scope: text.html.handlebars
- ace_mode: handlebars
- language_id: 155
-Harbour:
- type: programming
- color: "#0e60e3"
- extensions:
- - ".hb"
- tm_scope: source.harbour
- ace_mode: text
- language_id: 156
-Haskell:
- type: programming
- color: "#5e5086"
- extensions:
- - ".hs"
- - ".hsc"
- interpreters:
- - runhaskell
- ace_mode: haskell
- codemirror_mode: haskell
- codemirror_mime_type: text/x-haskell
- language_id: 157
-Haxe:
- type: programming
- ace_mode: haxe
- codemirror_mode: haxe
- codemirror_mime_type: text/x-haxe
- color: "#df7900"
- extensions:
- - ".hx"
- - ".hxsl"
- tm_scope: source.haxe.2
- language_id: 158
-Hy:
- type: programming
- ace_mode: text
- color: "#7790B2"
- extensions:
- - ".hy"
- aliases:
- - hylang
- tm_scope: none
- language_id: 159
-HyPhy:
- type: programming
- ace_mode: text
- extensions:
- - ".bf"
- tm_scope: none
- language_id: 160
-IDL:
- type: programming
- color: "#a3522f"
- extensions:
- - ".pro"
- - ".dlm"
- ace_mode: text
- codemirror_mode: idl
- codemirror_mime_type: text/x-idl
- language_id: 161
-IGOR Pro:
- type: programming
- extensions:
- - ".ipf"
- aliases:
- - igor
- - igorpro
- tm_scope: none
- ace_mode: text
- language_id: 162
-INI:
- type: data
- extensions:
- - ".ini"
- - ".cfg"
- - ".prefs"
- - ".pro"
- - ".properties"
- filenames:
- - buildozer.spec
- tm_scope: source.ini
- aliases:
- - dosini
- ace_mode: ini
- codemirror_mode: properties
- codemirror_mime_type: text/x-properties
- language_id: 163
-IRC log:
- type: data
- aliases:
- - irc
- - irc logs
- extensions:
- - ".irclog"
- - ".weechatlog"
- tm_scope: none
- ace_mode: text
- codemirror_mode: mirc
- codemirror_mime_type: text/mirc
- language_id: 164
-Idris:
- type: programming
- color: "#b30000"
- extensions:
- - ".idr"
- - ".lidr"
- ace_mode: text
- tm_scope: source.idris
- language_id: 165
-Inform 7:
- type: programming
- wrap: true
- extensions:
- - ".ni"
- - ".i7x"
- tm_scope: source.inform7
- aliases:
- - i7
- - inform7
- ace_mode: text
- language_id: 166
-Inno Setup:
- type: programming
- extensions:
- - ".iss"
- tm_scope: none
- ace_mode: text
- language_id: 167
-Io:
- type: programming
- color: "#a9188d"
- extensions:
- - ".io"
- interpreters:
- - io
- ace_mode: io
- language_id: 168
-Ioke:
- type: programming
- color: "#078193"
- extensions:
- - ".ik"
- interpreters:
- - ioke
- ace_mode: text
- language_id: 169
-Isabelle:
- type: programming
- color: "#FEFE00"
- extensions:
- - ".thy"
- tm_scope: source.isabelle.theory
- ace_mode: text
- language_id: 170
-Isabelle ROOT:
- type: programming
- group: Isabelle
- filenames:
- - ROOT
- tm_scope: source.isabelle.root
- ace_mode: text
- language_id: 171
-J:
- type: programming
- color: "#9EEDFF"
- extensions:
- - ".ijs"
- interpreters:
- - jconsole
- tm_scope: source.j
- ace_mode: text
- language_id: 172
-JFlex:
- type: programming
- group: Lex
- extensions:
- - ".flex"
- - ".jflex"
- tm_scope: source.jflex
- ace_mode: text
- language_id: 173
-JSON:
- type: data
- tm_scope: source.json
- group: JavaScript
- ace_mode: json
- codemirror_mode: javascript
- codemirror_mime_type: application/json
- searchable: false
- extensions:
- - ".json"
- - ".avsc"
- - ".geojson"
- - ".gltf"
- - ".JSON-tmLanguage"
- - ".jsonl"
- - ".tfstate"
- - ".tfstate.backup"
- - ".topojson"
- - ".webapp"
- - ".webmanifest"
- filenames:
- - ".arcconfig"
- - ".htmlhintrc"
- - ".jscsrc"
- - ".jshintrc"
- - ".tern-config"
- - ".tern-project"
- - composer.lock
- - mcmod.info
- language_id: 174
-JSON5:
- type: data
- extensions:
- - ".json5"
- filenames:
- - ".babelrc"
- - ".jslintrc"
- tm_scope: source.js
- ace_mode: javascript
- codemirror_mode: javascript
- codemirror_mime_type: application/json
- language_id: 175
-JSONLD:
- type: data
- group: JavaScript
- ace_mode: javascript
- extensions:
- - ".jsonld"
- tm_scope: source.js
- language_id: 176
-JSONiq:
- color: "#40d47e"
- type: programming
- ace_mode: jsoniq
- codemirror_mode: javascript
- codemirror_mime_type: application/json
- extensions:
- - ".jq"
- tm_scope: source.jq
- language_id: 177
-JSX:
- type: programming
- group: JavaScript
- extensions:
- - ".jsx"
- tm_scope: source.js.jsx
- ace_mode: javascript
- codemirror_mode: jsx
- codemirror_mime_type: text/jsx
- language_id: 178
-Jasmin:
- type: programming
- ace_mode: java
- extensions:
- - ".j"
- tm_scope: source.jasmin
- language_id: 180
-Java:
- type: programming
- ace_mode: java
- codemirror_mode: clike
- codemirror_mime_type: text/x-java
- color: "#b07219"
- extensions:
- - ".java"
- language_id: 181
-Java Server Pages:
- type: programming
- group: Java
- aliases:
- - jsp
- extensions:
- - ".jsp"
- tm_scope: text.html.jsp
- ace_mode: jsp
- codemirror_mode: htmlembedded
- codemirror_mime_type: application/x-jsp
- language_id: 182
-JavaScript:
- type: programming
- tm_scope: source.js
- ace_mode: javascript
- codemirror_mode: javascript
- codemirror_mime_type: text/javascript
- color: "#f1e05a"
- aliases:
- - js
- - node
- extensions:
- - ".js"
- - "._js"
- - ".bones"
- - ".es"
- - ".es6"
- - ".frag"
- - ".gs"
- - ".jake"
- - ".jsb"
- - ".jscad"
- - ".jsfl"
- - ".jsm"
- - ".jss"
- - ".mjs"
- - ".njs"
- - ".pac"
- - ".sjs"
- - ".ssjs"
- - ".xsjs"
- - ".xsjslib"
- filenames:
- - Jakefile
- interpreters:
- - node
- language_id: 183
-Jison:
- type: programming
- group: Yacc
- extensions:
- - ".jison"
- tm_scope: source.jison
- ace_mode: text
- language_id: 284531423
-Jison Lex:
- type: programming
- group: Lex
- extensions:
- - ".jisonlex"
- tm_scope: source.jisonlex
- ace_mode: text
- language_id: 406395330
-Jolie:
- type: programming
- extensions:
- - ".ol"
- - ".iol"
- interpreters:
- - jolie
- color: "#843179"
- ace_mode: text
- tm_scope: source.jolie
- language_id: 998078858
-Julia:
- type: programming
- extensions:
- - ".jl"
- interpreters:
- - julia
- color: "#a270ba"
- ace_mode: julia
- codemirror_mode: julia
- codemirror_mime_type: text/x-julia
- language_id: 184
-Jupyter Notebook:
- type: markup
- ace_mode: json
- codemirror_mode: javascript
- codemirror_mime_type: application/json
- tm_scope: source.json
- color: "#DA5B0B"
- extensions:
- - ".ipynb"
- filenames:
- - Notebook
- aliases:
- - IPython Notebook
- language_id: 185
-KRL:
- type: programming
- color: "#28431f"
- extensions:
- - ".krl"
- tm_scope: none
- ace_mode: text
- language_id: 186
-KiCad Layout:
- type: data
- aliases:
- - pcbnew
- extensions:
- - ".kicad_pcb"
- - ".kicad_mod"
- - ".kicad_wks"
- filenames:
- - fp-lib-table
- tm_scope: source.pcb.sexp
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- language_id: 187
-KiCad Legacy Layout:
- type: data
- extensions:
- - ".brd"
- tm_scope: source.pcb.board
- ace_mode: text
- language_id: 140848857
-KiCad Schematic:
- type: data
- aliases:
- - eeschema schematic
- extensions:
- - ".sch"
- tm_scope: source.pcb.schematic
- ace_mode: text
- language_id: 622447435
-Kit:
- type: markup
- ace_mode: html
- codemirror_mode: htmlmixed
- codemirror_mime_type: text/html
- extensions:
- - ".kit"
- tm_scope: text.html.basic
- language_id: 188
-Kotlin:
- type: programming
- color: "#F18E33"
- extensions:
- - ".kt"
- - ".ktm"
- - ".kts"
- tm_scope: source.Kotlin
- ace_mode: text
- codemirror_mode: clike
- codemirror_mime_type: text/x-kotlin
- language_id: 189
-LFE:
- type: programming
- color: "#4C3023"
- extensions:
- - ".lfe"
- tm_scope: source.lisp
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- language_id: 190
-LLVM:
- type: programming
- extensions:
- - ".ll"
- ace_mode: text
- color: "#185619"
- language_id: 191
-LOLCODE:
- type: programming
- extensions:
- - ".lol"
- color: "#cc9900"
- tm_scope: none
- ace_mode: text
- language_id: 192
-LSL:
- type: programming
- ace_mode: lsl
- extensions:
- - ".lsl"
- - ".lslp"
- interpreters:
- - lsl
- color: "#3d9970"
- language_id: 193
-LabVIEW:
- type: programming
- extensions:
- - ".lvproj"
- tm_scope: text.xml
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- language_id: 194
-Lasso:
- type: programming
- color: "#999999"
- extensions:
- - ".lasso"
- - ".las"
- - ".lasso8"
- - ".lasso9"
- - ".ldml"
- tm_scope: file.lasso
- aliases:
- - lassoscript
- ace_mode: text
- language_id: 195
-Latte:
- type: markup
- group: HTML
- extensions:
- - ".latte"
- tm_scope: text.html.smarty
- ace_mode: smarty
- codemirror_mode: smarty
- codemirror_mime_type: text/x-smarty
- language_id: 196
-Lean:
- type: programming
- extensions:
- - ".lean"
- - ".hlean"
- ace_mode: text
- language_id: 197
-Less:
- type: markup
- group: CSS
- extensions:
- - ".less"
- tm_scope: source.css.less
- ace_mode: less
- codemirror_mode: css
- codemirror_mime_type: text/css
- language_id: 198
-Lex:
- type: programming
- color: "#DBCA00"
- aliases:
- - flex
- extensions:
- - ".l"
- - ".lex"
- tm_scope: none
- ace_mode: text
- language_id: 199
-LilyPond:
- type: programming
- extensions:
- - ".ly"
- - ".ily"
- ace_mode: text
- language_id: 200
-Limbo:
- type: programming
- extensions:
- - ".b"
- - ".m"
- tm_scope: none
- ace_mode: text
- language_id: 201
-Linker Script:
- type: data
- extensions:
- - ".ld"
- - ".lds"
- - ".x"
- filenames:
- - ld.script
- tm_scope: none
- ace_mode: text
- language_id: 202
-Linux Kernel Module:
- type: data
- extensions:
- - ".mod"
- tm_scope: none
- ace_mode: text
- language_id: 203
-Liquid:
- type: markup
- extensions:
- - ".liquid"
- tm_scope: text.html.liquid
- ace_mode: liquid
- language_id: 204
-Literate Agda:
- type: programming
- group: Agda
- extensions:
- - ".lagda"
- tm_scope: none
- ace_mode: text
- language_id: 205
-Literate CoffeeScript:
- type: programming
- tm_scope: source.litcoffee
- group: CoffeeScript
- ace_mode: text
- wrap: true
- aliases:
- - litcoffee
- extensions:
- - ".litcoffee"
- language_id: 206
-Literate Haskell:
- type: programming
- group: Haskell
- aliases:
- - lhaskell
- - lhs
- extensions:
- - ".lhs"
- tm_scope: text.tex.latex.haskell
- ace_mode: text
- codemirror_mode: haskell-literate
- codemirror_mime_type: text/x-literate-haskell
- language_id: 207
-LiveScript:
- type: programming
- color: "#499886"
- aliases:
- - live-script
- - ls
- extensions:
- - ".ls"
- - "._ls"
- filenames:
- - Slakefile
- ace_mode: livescript
- codemirror_mode: livescript
- codemirror_mime_type: text/x-livescript
- language_id: 208
-Logos:
- type: programming
- extensions:
- - ".xm"
- - ".x"
- - ".xi"
- ace_mode: text
- tm_scope: source.logos
- language_id: 209
-Logtalk:
- type: programming
- extensions:
- - ".lgt"
- - ".logtalk"
- ace_mode: text
- language_id: 210
-LookML:
- type: programming
- ace_mode: yaml
- codemirror_mode: yaml
- codemirror_mime_type: text/x-yaml
- color: "#652B81"
- extensions:
- - ".lookml"
- - ".model.lkml"
- - ".view.lkml"
- tm_scope: source.yaml
- language_id: 211
-LoomScript:
- type: programming
- extensions:
- - ".ls"
- tm_scope: source.loomscript
- ace_mode: text
- language_id: 212
-Lua:
- type: programming
- ace_mode: lua
- codemirror_mode: lua
- codemirror_mime_type: text/x-lua
- color: "#000080"
- extensions:
- - ".lua"
- - ".fcgi"
- - ".nse"
- - ".pd_lua"
- - ".rbxs"
- - ".wlua"
- interpreters:
- - lua
- language_id: 213
-M:
- type: programming
- aliases:
- - mumps
- extensions:
- - ".mumps"
- - ".m"
- ace_mode: text
- codemirror_mode: mumps
- codemirror_mime_type: text/x-mumps
- language_id: 214
- tm_scope: none
-M4:
- type: programming
- extensions:
- - ".m4"
- tm_scope: none
- ace_mode: text
- language_id: 215
-M4Sugar:
- type: programming
- group: M4
- aliases:
- - autoconf
- extensions:
- - ".m4"
- filenames:
- - configure.ac
- tm_scope: none
- ace_mode: text
- language_id: 216
-MAXScript:
- type: programming
- color: "#00a6a6"
- extensions:
- - ".ms"
- - ".mcr"
- tm_scope: source.maxscript
- ace_mode: text
- language_id: 217
-MQL4:
- type: programming
- color: "#62A8D6"
- extensions:
- - ".mq4"
- - ".mqh"
- tm_scope: source.mql5
- ace_mode: c_cpp
- language_id: 426
-MQL5:
- type: programming
- color: "#4A76B8"
- extensions:
- - ".mq5"
- - ".mqh"
- tm_scope: source.mql5
- ace_mode: c_cpp
- language_id: 427
-MTML:
- type: markup
- color: "#b7e1f4"
- extensions:
- - ".mtml"
- tm_scope: text.html.basic
- ace_mode: html
- codemirror_mode: htmlmixed
- codemirror_mime_type: text/html
- language_id: 218
-MUF:
- type: programming
- group: Forth
- extensions:
- - ".muf"
- - ".m"
- tm_scope: none
- ace_mode: forth
- codemirror_mode: forth
- codemirror_mime_type: text/x-forth
- language_id: 219
-Makefile:
- type: programming
- color: "#427819"
- aliases:
- - bsdmake
- - make
- - mf
- extensions:
- - ".mak"
- - ".d"
- - ".make"
- - ".mk"
- - ".mkfile"
- filenames:
- - BSDmakefile
- - GNUmakefile
- - Kbuild
- - Makefile
- - Makefile.am
- - Makefile.boot
- - Makefile.frag
- - Makefile.in
- - Makefile.inc
- - Makefile.wat
- - makefile
- - makefile.sco
- - mkfile
- interpreters:
- - make
- ace_mode: makefile
- codemirror_mode: cmake
- codemirror_mime_type: text/x-cmake
- language_id: 220
-Mako:
- type: programming
- extensions:
- - ".mako"
- - ".mao"
- tm_scope: text.html.mako
- ace_mode: text
- language_id: 221
-Markdown:
- type: prose
- aliases:
- - pandoc
- ace_mode: markdown
- codemirror_mode: gfm
- codemirror_mime_type: text/x-gfm
- wrap: true
- extensions:
- - ".md"
- - ".markdown"
- - ".mdown"
- - ".mdwn"
- - ".mkd"
- - ".mkdn"
- - ".mkdown"
- - ".ron"
- - ".workbook"
- tm_scope: source.gfm
- language_id: 222
-Marko:
- group: HTML
- type: markup
- tm_scope: text.marko
- extensions:
- - ".marko"
- aliases:
- - markojs
- ace_mode: text
- codemirror_mode: htmlmixed
- codemirror_mime_type: text/html
- language_id: 932782397
-Mask:
- type: markup
- color: "#f97732"
- ace_mode: mask
- extensions:
- - ".mask"
- tm_scope: source.mask
- language_id: 223
-Mathematica:
- type: programming
- extensions:
- - ".mathematica"
- - ".cdf"
- - ".m"
- - ".ma"
- - ".mt"
- - ".nb"
- - ".nbp"
- - ".wl"
- - ".wlt"
- aliases:
- - mma
- ace_mode: text
- codemirror_mode: mathematica
- codemirror_mime_type: text/x-mathematica
- language_id: 224
-Matlab:
- type: programming
- color: "#e16737"
- aliases:
- - octave
- extensions:
- - ".matlab"
- - ".m"
- ace_mode: matlab
- codemirror_mode: octave
- codemirror_mime_type: text/x-octave
- language_id: 225
-Maven POM:
- type: data
- tm_scope: text.xml.pom
- filenames:
- - pom.xml
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- language_id: 226
-Max:
- type: programming
- color: "#c4a79c"
- aliases:
- - max/msp
- - maxmsp
- extensions:
- - ".maxpat"
- - ".maxhelp"
- - ".maxproj"
- - ".mxt"
- - ".pat"
- tm_scope: source.json
- ace_mode: json
- codemirror_mode: javascript
- codemirror_mime_type: application/json
- language_id: 227
-MediaWiki:
- type: prose
- wrap: true
- extensions:
- - ".mediawiki"
- - ".wiki"
- tm_scope: text.html.mediawiki
- ace_mode: text
- language_id: 228
-Mercury:
- type: programming
- color: "#ff2b2b"
- ace_mode: prolog
- interpreters:
- - mmi
- extensions:
- - ".m"
- - ".moo"
- tm_scope: source.mercury
- language_id: 229
-Meson:
- type: programming
- color: "#007800"
- filenames:
- - meson.build
- - meson_options.txt
- tm_scope: source.meson
- ace_mode: text
- language_id: 799141244
-Metal:
- type: programming
- color: "#8f14e9"
- extensions:
- - ".metal"
- tm_scope: source.c++
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-c++src
- language_id: 230
-MiniD:
- type: programming
- searchable: false
- extensions:
- - ".minid"
- tm_scope: none
- ace_mode: text
- language_id: 231
-Mirah:
- type: programming
- color: "#c7a938"
- extensions:
- - ".druby"
- - ".duby"
- - ".mir"
- - ".mirah"
- tm_scope: source.ruby
- ace_mode: ruby
- codemirror_mode: ruby
- codemirror_mime_type: text/x-ruby
- language_id: 232
-Modelica:
- type: programming
- extensions:
- - ".mo"
- tm_scope: source.modelica
- ace_mode: text
- codemirror_mode: modelica
- codemirror_mime_type: text/x-modelica
- language_id: 233
-Modula-2:
- type: programming
- extensions:
- - ".mod"
- tm_scope: source.modula2
- ace_mode: text
- language_id: 234
-Module Management System:
- type: programming
- extensions:
- - ".mms"
- - ".mmk"
- filenames:
- - descrip.mmk
- - descrip.mms
- tm_scope: none
- ace_mode: text
- language_id: 235
-Monkey:
- type: programming
- extensions:
- - ".monkey"
- - ".monkey2"
- ace_mode: text
- tm_scope: source.monkey
- language_id: 236
-Moocode:
- type: programming
- extensions:
- - ".moo"
- tm_scope: none
- ace_mode: text
- language_id: 237
-MoonScript:
- type: programming
- extensions:
- - ".moon"
- interpreters:
- - moon
- ace_mode: text
- language_id: 238
-Myghty:
- type: programming
- extensions:
- - ".myt"
- tm_scope: none
- ace_mode: text
- language_id: 239
-NCL:
- type: programming
- color: "#28431f"
- extensions:
- - ".ncl"
- tm_scope: source.ncl
- ace_mode: text
- language_id: 240
-NL:
- type: data
- extensions:
- - ".nl"
- tm_scope: none
- ace_mode: text
- language_id: 241
-NSIS:
- type: programming
- extensions:
- - ".nsi"
- - ".nsh"
- ace_mode: text
- codemirror_mode: nsis
- codemirror_mime_type: text/x-nsis
- language_id: 242
-Nearley:
- type: programming
- ace_mode: text
- color: "#990000"
- extensions:
- - ".ne"
- - ".nearley"
- tm_scope: source.ne
- language_id: 521429430
-Nemerle:
- type: programming
- color: "#3d3c6e"
- extensions:
- - ".n"
- ace_mode: text
- language_id: 243
-NetLinx:
- type: programming
- color: "#0aa0ff"
- extensions:
- - ".axs"
- - ".axi"
- tm_scope: source.netlinx
- ace_mode: text
- language_id: 244
-NetLinx+ERB:
- type: programming
- color: "#747faa"
- extensions:
- - ".axs.erb"
- - ".axi.erb"
- tm_scope: source.netlinx.erb
- ace_mode: text
- language_id: 245
-NetLogo:
- type: programming
- color: "#ff6375"
- extensions:
- - ".nlogo"
- tm_scope: source.lisp
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- language_id: 246
-NewLisp:
- type: programming
- lexer: NewLisp
- color: "#87AED7"
- extensions:
- - ".nl"
- - ".lisp"
- - ".lsp"
- interpreters:
- - newlisp
- tm_scope: source.lisp
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- language_id: 247
-Nextflow:
- type: programming
- ace_mode: groovy
- tm_scope: source.nextflow
- color: "#3ac486"
- extensions:
- - ".nf"
- filenames:
- - nextflow.config
- interpreters:
- - nextflow
- language_id: 506780613
-Nginx:
- type: data
- extensions:
- - ".nginxconf"
- - ".vhost"
- filenames:
- - nginx.conf
- tm_scope: source.nginx
- aliases:
- - nginx configuration file
- ace_mode: text
- codemirror_mode: nginx
- codemirror_mime_type: text/x-nginx-conf
- language_id: 248
-Nim:
- type: programming
- color: "#37775b"
- extensions:
- - ".nim"
- - ".nimrod"
- ace_mode: text
- tm_scope: source.nim
- language_id: 249
-Ninja:
- type: data
- tm_scope: source.ninja
- extensions:
- - ".ninja"
- ace_mode: text
- language_id: 250
-Nit:
- type: programming
- color: "#009917"
- extensions:
- - ".nit"
- tm_scope: source.nit
- ace_mode: text
- language_id: 251
-Nix:
- type: programming
- color: "#7e7eff"
- extensions:
- - ".nix"
- aliases:
- - nixos
- tm_scope: source.nix
- ace_mode: nix
- language_id: 252
-Nu:
- type: programming
- color: "#c9df40"
- aliases:
- - nush
- extensions:
- - ".nu"
- filenames:
- - Nukefile
- tm_scope: source.nu
- ace_mode: scheme
- codemirror_mode: scheme
- codemirror_mime_type: text/x-scheme
- interpreters:
- - nush
- language_id: 253
-NumPy:
- type: programming
- group: Python
- extensions:
- - ".numpy"
- - ".numpyw"
- - ".numsc"
- tm_scope: none
- ace_mode: text
- codemirror_mode: python
- codemirror_mime_type: text/x-python
- language_id: 254
-OCaml:
- type: programming
- ace_mode: ocaml
- codemirror_mode: mllike
- codemirror_mime_type: text/x-ocaml
- color: "#3be133"
- extensions:
- - ".ml"
- - ".eliom"
- - ".eliomi"
- - ".ml4"
- - ".mli"
- - ".mll"
- - ".mly"
- interpreters:
- - ocaml
- - ocamlrun
- - ocamlscript
- tm_scope: source.ocaml
- language_id: 255
-ObjDump:
- type: data
- extensions:
- - ".objdump"
- tm_scope: objdump.x86asm
- ace_mode: assembly_x86
- language_id: 256
-Objective-C:
- type: programming
- tm_scope: source.objc
- color: "#438eff"
- aliases:
- - obj-c
- - objc
- - objectivec
- extensions:
- - ".m"
- - ".h"
- ace_mode: objectivec
- codemirror_mode: clike
- codemirror_mime_type: text/x-objectivec
- language_id: 257
-Objective-C++:
- type: programming
- tm_scope: source.objc++
- color: "#6866fb"
- aliases:
- - obj-c++
- - objc++
- - objectivec++
- extensions:
- - ".mm"
- ace_mode: objectivec
- codemirror_mode: clike
- codemirror_mime_type: text/x-objectivec
- language_id: 258
-Objective-J:
- type: programming
- color: "#ff0c5a"
- aliases:
- - obj-j
- - objectivej
- - objj
- extensions:
- - ".j"
- - ".sj"
- tm_scope: source.js.objj
- ace_mode: text
- language_id: 259
-Omgrofl:
- type: programming
- extensions:
- - ".omgrofl"
- color: "#cabbff"
- tm_scope: none
- ace_mode: text
- language_id: 260
-Opa:
- type: programming
- extensions:
- - ".opa"
- ace_mode: text
- language_id: 261
-Opal:
- type: programming
- color: "#f7ede0"
- extensions:
- - ".opal"
- tm_scope: source.opal
- ace_mode: text
- language_id: 262
-OpenCL:
- type: programming
- group: C
- extensions:
- - ".cl"
- - ".opencl"
- tm_scope: source.c
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csrc
- language_id: 263
-OpenEdge ABL:
- type: programming
- aliases:
- - progress
- - openedge
- - abl
- extensions:
- - ".p"
- - ".cls"
- - ".w"
- tm_scope: source.abl
- ace_mode: text
- language_id: 264
-OpenRC runscript:
- type: programming
- group: Shell
- aliases:
- - openrc
- interpreters:
- - openrc-run
- tm_scope: source.shell
- ace_mode: sh
- codemirror_mode: shell
- codemirror_mime_type: text/x-sh
- language_id: 265
-OpenSCAD:
- type: programming
- extensions:
- - ".scad"
- tm_scope: source.scad
- ace_mode: scad
- language_id: 266
-OpenType Feature File:
- type: data
- aliases:
- - AFDKO
- extensions:
- - ".fea"
- tm_scope: source.opentype
- ace_mode: text
- language_id: 374317347
-Org:
- type: prose
- wrap: true
- extensions:
- - ".org"
- tm_scope: none
- ace_mode: text
- language_id: 267
-Ox:
- type: programming
- extensions:
- - ".ox"
- - ".oxh"
- - ".oxo"
- tm_scope: source.ox
- ace_mode: text
- language_id: 268
-Oxygene:
- type: programming
- color: "#cdd0e3"
- extensions:
- - ".oxygene"
- tm_scope: none
- ace_mode: text
- language_id: 269
-Oz:
- type: programming
- color: "#fab738"
- extensions:
- - ".oz"
- tm_scope: source.oz
- ace_mode: text
- codemirror_mode: oz
- codemirror_mime_type: text/x-oz
- language_id: 270
-P4:
- type: programming
- color: "#7055b5"
- extensions:
- - ".p4"
- tm_scope: source.p4
- ace_mode: text
- language_id: 348895984
-PAWN:
- type: programming
- color: "#dbb284"
- extensions:
- - ".pwn"
- - ".inc"
- tm_scope: source.pawn
- ace_mode: text
- language_id: 271
-PHP:
- type: programming
- tm_scope: text.html.php
- ace_mode: php
- codemirror_mode: php
- codemirror_mime_type: application/x-httpd-php
- color: "#4F5D95"
- extensions:
- - ".php"
- - ".aw"
- - ".ctp"
- - ".fcgi"
- - ".inc"
- - ".php3"
- - ".php4"
- - ".php5"
- - ".phps"
- - ".phpt"
- filenames:
- - ".php_cs"
- - ".php_cs.dist"
- - Phakefile
- interpreters:
- - php
- aliases:
- - inc
- language_id: 272
-PLSQL:
- type: programming
- ace_mode: sql
- codemirror_mode: sql
- codemirror_mime_type: text/x-plsql
- tm_scope: none
- color: "#dad8d8"
- extensions:
- - ".pls"
- - ".bdy"
- - ".ddl"
- - ".fnc"
- - ".pck"
- - ".pkb"
- - ".pks"
- - ".plb"
- - ".plsql"
- - ".prc"
- - ".spc"
- - ".sql"
- - ".tpb"
- - ".tps"
- - ".trg"
- - ".vw"
- language_id: 273
-PLpgSQL:
- type: programming
- ace_mode: pgsql
- codemirror_mode: sql
- codemirror_mime_type: text/x-sql
- tm_scope: source.sql
- extensions:
- - ".sql"
- language_id: 274
-POV-Ray SDL:
- type: programming
- aliases:
- - pov-ray
- - povray
- extensions:
- - ".pov"
- - ".inc"
- ace_mode: text
- language_id: 275
-Pan:
- type: programming
- color: "#cc0000"
- extensions:
- - ".pan"
- tm_scope: source.pan
- ace_mode: text
- language_id: 276
-Papyrus:
- type: programming
- color: "#6600cc"
- extensions:
- - ".psc"
- tm_scope: source.papyrus.skyrim
- ace_mode: text
- language_id: 277
-Parrot:
- type: programming
- color: "#f3ca0a"
- extensions:
- - ".parrot"
- tm_scope: none
- ace_mode: text
- language_id: 278
-Parrot Assembly:
- group: Parrot
- type: programming
- aliases:
- - pasm
- extensions:
- - ".pasm"
- interpreters:
- - parrot
- tm_scope: none
- ace_mode: text
- language_id: 279
-Parrot Internal Representation:
- group: Parrot
- tm_scope: source.parrot.pir
- type: programming
- aliases:
- - pir
- extensions:
- - ".pir"
- interpreters:
- - parrot
- ace_mode: text
- language_id: 280
-Pascal:
- type: programming
- color: "#E3F171"
- extensions:
- - ".pas"
- - ".dfm"
- - ".dpr"
- - ".inc"
- - ".lpr"
- - ".pascal"
- - ".pp"
- interpreters:
- - instantfpc
- ace_mode: pascal
- codemirror_mode: pascal
- codemirror_mime_type: text/x-pascal
- language_id: 281
-Pep8:
- type: programming
- color: "#C76F5B"
- extensions:
- - ".pep"
- ace_mode: text
- tm_scope: source.pep8
- language_id: 840372442
-Perl:
- type: programming
- tm_scope: source.perl
- ace_mode: perl
- codemirror_mode: perl
- codemirror_mime_type: text/x-perl
- color: "#0298c3"
- extensions:
- - ".pl"
- - ".al"
- - ".cgi"
- - ".fcgi"
- - ".perl"
- - ".ph"
- - ".plx"
- - ".pm"
- - ".psgi"
- - ".t"
- filenames:
- - cpanfile
- interpreters:
- - perl
- language_id: 282
-Perl 6:
- type: programming
- color: "#0000fb"
- extensions:
- - ".6pl"
- - ".6pm"
- - ".nqp"
- - ".p6"
- - ".p6l"
- - ".p6m"
- - ".pl"
- - ".pl6"
- - ".pm"
- - ".pm6"
- - ".t"
- filenames:
- - Rexfile
- interpreters:
- - perl6
- aliases:
- - perl6
- tm_scope: source.perl6fe
- ace_mode: perl
- codemirror_mode: perl
- codemirror_mime_type: text/x-perl
- language_id: 283
-Pic:
- type: markup
- group: Roff
- tm_scope: source.pic
- extensions:
- - ".pic"
- - ".chem"
- ace_mode: text
- codemirror_mode: troff
- codemirror_mime_type: text/troff
- language_id: 425
-Pickle:
- type: data
- extensions:
- - ".pkl"
- tm_scope: none
- ace_mode: text
- language_id: 284
-PicoLisp:
- type: programming
- extensions:
- - ".l"
- interpreters:
- - picolisp
- - pil
- tm_scope: source.lisp
- ace_mode: lisp
- language_id: 285
-PigLatin:
- type: programming
- color: "#fcd7de"
- extensions:
- - ".pig"
- tm_scope: source.pig_latin
- ace_mode: text
- language_id: 286
-Pike:
- type: programming
- color: "#005390"
- extensions:
- - ".pike"
- - ".pmod"
- interpreters:
- - pike
- ace_mode: text
- language_id: 287
-Pod:
- type: prose
- ace_mode: perl
- codemirror_mode: perl
- codemirror_mime_type: text/x-perl
- wrap: true
- extensions:
- - ".pod"
- interpreters:
- - perl
- tm_scope: none
- language_id: 288
-PogoScript:
- type: programming
- color: "#d80074"
- extensions:
- - ".pogo"
- tm_scope: source.pogoscript
- ace_mode: text
- language_id: 289
-Pony:
- type: programming
- extensions:
- - ".pony"
- tm_scope: source.pony
- ace_mode: text
- language_id: 290
-PostCSS:
- type: markup
- tm_scope: source.postcss
- group: CSS
- extensions:
- - ".pcss"
- ace_mode: text
- language_id: 262764437
-PostScript:
- type: markup
- color: "#da291c"
- extensions:
- - ".ps"
- - ".eps"
- - ".pfa"
- tm_scope: source.postscript
- aliases:
- - postscr
- ace_mode: text
- language_id: 291
-PowerBuilder:
- type: programming
- color: "#8f0f8d"
- extensions:
- - ".pbt"
- - ".sra"
- - ".sru"
- - ".srw"
- tm_scope: none
- ace_mode: text
- language_id: 292
-PowerShell:
- type: programming
- color: "#012456"
- ace_mode: powershell
- codemirror_mode: powershell
- codemirror_mime_type: application/x-powershell
- aliases:
- - posh
- extensions:
- - ".ps1"
- - ".psd1"
- - ".psm1"
- language_id: 293
-Processing:
- type: programming
- color: "#0096D8"
- extensions:
- - ".pde"
- ace_mode: text
- language_id: 294
-Prolog:
- type: programming
- color: "#74283c"
- extensions:
- - ".pl"
- - ".pro"
- - ".prolog"
- - ".yap"
- interpreters:
- - swipl
- - yap
- tm_scope: source.prolog
- ace_mode: prolog
- language_id: 295
-Propeller Spin:
- type: programming
- color: "#7fa2a7"
- extensions:
- - ".spin"
- tm_scope: source.spin
- ace_mode: text
- language_id: 296
-Protocol Buffer:
- type: data
- aliases:
- - protobuf
- - Protocol Buffers
- extensions:
- - ".proto"
- tm_scope: source.protobuf
- ace_mode: protobuf
- codemirror_mode: protobuf
- codemirror_mime_type: text/x-protobuf
- language_id: 297
-Public Key:
- type: data
- extensions:
- - ".asc"
- - ".pub"
- tm_scope: none
- ace_mode: text
- codemirror_mode: asciiarmor
- codemirror_mime_type: application/pgp
- language_id: 298
-Pug:
- group: HTML
- type: markup
- extensions:
- - ".jade"
- - ".pug"
- tm_scope: text.jade
- ace_mode: jade
- codemirror_mode: pug
- codemirror_mime_type: text/x-pug
- language_id: 179
-Puppet:
- type: programming
- color: "#302B6D"
- extensions:
- - ".pp"
- filenames:
- - Modulefile
- ace_mode: text
- codemirror_mode: puppet
- codemirror_mime_type: text/x-puppet
- tm_scope: source.puppet
- language_id: 299
-Pure Data:
- type: data
- extensions:
- - ".pd"
- tm_scope: none
- ace_mode: text
- language_id: 300
-PureBasic:
- type: programming
- color: "#5a6986"
- extensions:
- - ".pb"
- - ".pbi"
- tm_scope: none
- ace_mode: text
- language_id: 301
-PureScript:
- type: programming
- color: "#1D222D"
- extensions:
- - ".purs"
- tm_scope: source.purescript
- ace_mode: haskell
- codemirror_mode: haskell
- codemirror_mime_type: text/x-haskell
- language_id: 302
-Python:
- type: programming
- ace_mode: python
- codemirror_mode: python
- codemirror_mime_type: text/x-python
- color: "#3572A5"
- extensions:
- - ".py"
- - ".bzl"
- - ".cgi"
- - ".fcgi"
- - ".gyp"
- - ".gypi"
- - ".lmi"
- - ".py3"
- - ".pyde"
- - ".pyi"
- - ".pyp"
- - ".pyt"
- - ".pyw"
- - ".rpy"
- - ".spec"
- - ".tac"
- - ".wsgi"
- - ".xpy"
- filenames:
- - ".gclient"
- - BUCK
- - BUILD
- - BUILD.bazel
- - SConscript
- - SConstruct
- - Snakefile
- - WORKSPACE
- - wscript
- interpreters:
- - python
- - python2
- - python3
- aliases:
- - rusthon
- - python3
- language_id: 303
-Python console:
- type: programming
- group: Python
- searchable: false
- aliases:
- - pycon
- tm_scope: text.python.console
- ace_mode: text
- language_id: 428
-Python traceback:
- type: data
- group: Python
- searchable: false
- extensions:
- - ".pytb"
- tm_scope: text.python.traceback
- ace_mode: text
- language_id: 304
-QML:
- type: programming
- color: "#44a51c"
- extensions:
- - ".qml"
- - ".qbs"
- tm_scope: source.qml
- ace_mode: text
- language_id: 305
-QMake:
- type: programming
- extensions:
- - ".pro"
- - ".pri"
- interpreters:
- - qmake
- ace_mode: text
- language_id: 306
-R:
- type: programming
- color: "#198CE7"
- aliases:
- - R
- - Rscript
- - splus
- extensions:
- - ".r"
- - ".rd"
- - ".rsx"
- filenames:
- - ".Rprofile"
- interpreters:
- - Rscript
- ace_mode: r
- codemirror_mode: r
- codemirror_mime_type: text/x-rsrc
- language_id: 307
-RAML:
- type: markup
- ace_mode: yaml
- codemirror_mode: yaml
- codemirror_mime_type: text/x-yaml
- tm_scope: source.yaml
- color: "#77d9fb"
- extensions:
- - ".raml"
- language_id: 308
-RDoc:
- type: prose
- ace_mode: rdoc
- wrap: true
- extensions:
- - ".rdoc"
- tm_scope: text.rdoc
- language_id: 309
-REALbasic:
- type: programming
- extensions:
- - ".rbbas"
- - ".rbfrm"
- - ".rbmnu"
- - ".rbres"
- - ".rbtbar"
- - ".rbuistate"
- tm_scope: source.vbnet
- ace_mode: text
- language_id: 310
-REXX:
- type: programming
- aliases:
- - arexx
- extensions:
- - ".rexx"
- - ".pprx"
- - ".rex"
- interpreters:
- - regina
- - rexx
- tm_scope: source.rexx
- ace_mode: text
- language_id: 311
-RHTML:
- type: markup
- group: HTML
- extensions:
- - ".rhtml"
- tm_scope: text.html.erb
- aliases:
- - html+ruby
- ace_mode: rhtml
- codemirror_mode: htmlembedded
- codemirror_mime_type: application/x-erb
- language_id: 312
-RMarkdown:
- type: prose
- wrap: true
- ace_mode: markdown
- codemirror_mode: gfm
- codemirror_mime_type: text/x-gfm
- extensions:
- - ".rmd"
- tm_scope: source.gfm
- language_id: 313
-RPC:
- type: programming
- aliases:
- - rpcgen
- - oncrpc
- - xdr
- ace_mode: c_cpp
- extensions:
- - ".x"
- tm_scope: source.c
- language_id: 1031374237
-RPM Spec:
- type: data
- tm_scope: source.rpm-spec
- extensions:
- - ".spec"
- aliases:
- - specfile
- ace_mode: text
- codemirror_mode: rpm
- codemirror_mime_type: text/x-rpm-spec
- language_id: 314
-RUNOFF:
- type: markup
- color: "#665a4e"
- extensions:
- - ".rnh"
- - ".rno"
- tm_scope: text.runoff
- ace_mode: text
- language_id: 315
-Racket:
- type: programming
- color: "#22228f"
- extensions:
- - ".rkt"
- - ".rktd"
- - ".rktl"
- - ".scrbl"
- interpreters:
- - racket
- tm_scope: source.racket
- ace_mode: lisp
- language_id: 316
-Ragel:
- type: programming
- color: "#9d5200"
- extensions:
- - ".rl"
- aliases:
- - ragel-rb
- - ragel-ruby
- tm_scope: none
- ace_mode: text
- language_id: 317
-Rascal:
- type: programming
- color: "#fffaa0"
- extensions:
- - ".rsc"
- tm_scope: source.rascal
- ace_mode: text
- language_id: 173616037
-Raw token data:
- type: data
- aliases:
- - raw
- extensions:
- - ".raw"
- tm_scope: none
- ace_mode: text
- language_id: 318
-Reason:
- type: programming
- group: OCaml
- ace_mode: rust
- codemirror_mode: rust
- codemirror_mime_type: text/x-rustsrc
- extensions:
- - ".re"
- - ".rei"
- interpreters:
- - ocaml
- tm_scope: source.reason
- language_id: 869538413
-Rebol:
- type: programming
- color: "#358a5b"
- extensions:
- - ".reb"
- - ".r"
- - ".r2"
- - ".r3"
- - ".rebol"
- ace_mode: text
- tm_scope: source.rebol
- language_id: 319
-Red:
- type: programming
- color: "#f50000"
- extensions:
- - ".red"
- - ".reds"
- aliases:
- - red/system
- tm_scope: source.red
- ace_mode: text
- language_id: 320
-Redcode:
- type: programming
- extensions:
- - ".cw"
- tm_scope: none
- ace_mode: text
- language_id: 321
-Regular Expression:
- type: data
- extensions:
- - ".regexp"
- - ".regex"
- aliases:
- - regexp
- - regex
- ace_mode: text
- tm_scope: source.regexp
- language_id: 363378884
-Ren'Py:
- type: programming
- aliases:
- - renpy
- color: "#ff7f7f"
- extensions:
- - ".rpy"
- tm_scope: source.renpy
- ace_mode: python
- language_id: 322
-RenderScript:
- type: programming
- extensions:
- - ".rs"
- - ".rsh"
- tm_scope: none
- ace_mode: text
- language_id: 323
-Ring:
- type: programming
- color: "#0e60e3"
- extensions:
- - ".ring"
- tm_scope: source.ring
- ace_mode: text
- language_id: 431
-RobotFramework:
- type: programming
- extensions:
- - ".robot"
- tm_scope: text.robot
- ace_mode: text
- language_id: 324
-Roff:
- type: markup
- color: "#ecdebe"
- extensions:
- - ".man"
- - ".1"
- - ".1in"
- - ".1m"
- - ".1x"
- - ".2"
- - ".3"
- - ".3in"
- - ".3m"
- - ".3qt"
- - ".3x"
- - ".4"
- - ".5"
- - ".6"
- - ".7"
- - ".8"
- - ".9"
- - ".l"
- - ".me"
- - ".ms"
- - ".n"
- - ".nr"
- - ".rno"
- - ".roff"
- - ".tmac"
- filenames:
- - mmn
- - mmt
- tm_scope: text.roff
- aliases:
- - nroff
- ace_mode: text
- codemirror_mode: troff
- codemirror_mime_type: text/troff
- language_id: 141
-Rouge:
- type: programming
- ace_mode: clojure
- codemirror_mode: clojure
- codemirror_mime_type: text/x-clojure
- color: "#cc0088"
- extensions:
- - ".rg"
- tm_scope: source.clojure
- language_id: 325
-Ruby:
- type: programming
- ace_mode: ruby
- codemirror_mode: ruby
- codemirror_mime_type: text/x-ruby
- color: "#701516"
- aliases:
- - jruby
- - macruby
- - rake
- - rb
- - rbx
- extensions:
- - ".rb"
- - ".builder"
- - ".eye"
- - ".fcgi"
- - ".gemspec"
- - ".god"
- - ".jbuilder"
- - ".mspec"
- - ".pluginspec"
- - ".podspec"
- - ".rabl"
- - ".rake"
- - ".rbuild"
- - ".rbw"
- - ".rbx"
- - ".ru"
- - ".ruby"
- - ".spec"
- - ".thor"
- - ".watchr"
- interpreters:
- - ruby
- - macruby
- - rake
- - jruby
- - rbx
- filenames:
- - ".irbrc"
- - ".pryrc"
- - Appraisals
- - Berksfile
- - Brewfile
- - Buildfile
- - Dangerfile
- - Deliverfile
- - Fastfile
- - Gemfile
- - Gemfile.lock
- - Guardfile
- - Jarfile
- - Mavenfile
- - Podfile
- - Puppetfile
- - Rakefile
- - Snapfile
- - Thorfile
- - Vagrantfile
- - buildfile
- language_id: 326
-Rust:
- type: programming
- color: "#dea584"
- extensions:
- - ".rs"
- - ".rs.in"
- ace_mode: rust
- codemirror_mode: rust
- codemirror_mime_type: text/x-rustsrc
- language_id: 327
-SAS:
- type: programming
- color: "#B34936"
- extensions:
- - ".sas"
- tm_scope: source.sas
- ace_mode: text
- codemirror_mode: sas
- codemirror_mime_type: text/x-sas
- language_id: 328
-SCSS:
- type: markup
- tm_scope: source.scss
- group: CSS
- ace_mode: scss
- codemirror_mode: css
- codemirror_mime_type: text/x-scss
- extensions:
- - ".scss"
- language_id: 329
-SMT:
- type: programming
- extensions:
- - ".smt2"
- - ".smt"
- interpreters:
- - boolector
- - cvc4
- - mathsat5
- - opensmt
- - smtinterpol
- - smt-rat
- - stp
- - verit
- - yices2
- - z3
- tm_scope: source.smt
- ace_mode: text
- language_id: 330
-SPARQL:
- type: data
- tm_scope: source.sparql
- ace_mode: text
- codemirror_mode: sparql
- codemirror_mime_type: application/sparql-query
- extensions:
- - ".sparql"
- - ".rq"
- language_id: 331
-SQF:
- type: programming
- color: "#3F3F3F"
- extensions:
- - ".sqf"
- - ".hqf"
- tm_scope: source.sqf
- ace_mode: text
- language_id: 332
-SQL:
- type: data
- tm_scope: source.sql
- ace_mode: sql
- codemirror_mode: sql
- codemirror_mime_type: text/x-sql
- extensions:
- - ".sql"
- - ".cql"
- - ".ddl"
- - ".inc"
- - ".mysql"
- - ".prc"
- - ".tab"
- - ".udf"
- - ".viw"
- language_id: 333
-SQLPL:
- type: programming
- ace_mode: sql
- codemirror_mode: sql
- codemirror_mime_type: text/x-sql
- tm_scope: source.sql
- extensions:
- - ".sql"
- - ".db2"
- language_id: 334
-SRecode Template:
- type: markup
- color: "#348a34"
- tm_scope: source.lisp
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- extensions:
- - ".srt"
- language_id: 335
-STON:
- type: data
- group: Smalltalk
- extensions:
- - ".ston"
- tm_scope: source.smalltalk
- ace_mode: text
- language_id: 336
-SVG:
- type: data
- extensions:
- - ".svg"
- tm_scope: text.xml
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- language_id: 337
-Sage:
- type: programming
- group: Python
- extensions:
- - ".sage"
- - ".sagews"
- tm_scope: source.python
- ace_mode: python
- codemirror_mode: python
- codemirror_mime_type: text/x-python
- language_id: 338
-SaltStack:
- type: programming
- color: "#646464"
- aliases:
- - saltstate
- - salt
- extensions:
- - ".sls"
- tm_scope: source.yaml.salt
- ace_mode: yaml
- codemirror_mode: yaml
- codemirror_mime_type: text/x-yaml
- language_id: 339
-Sass:
- type: markup
- tm_scope: source.sass
- group: CSS
- extensions:
- - ".sass"
- ace_mode: sass
- codemirror_mode: sass
- codemirror_mime_type: text/x-sass
- language_id: 340
-Scala:
- type: programming
- ace_mode: scala
- codemirror_mode: clike
- codemirror_mime_type: text/x-scala
- color: "#c22d40"
- extensions:
- - ".scala"
- - ".kojo"
- - ".sbt"
- - ".sc"
- interpreters:
- - scala
- language_id: 341
-Scaml:
- group: HTML
- type: markup
- extensions:
- - ".scaml"
- tm_scope: source.scaml
- ace_mode: text
- language_id: 342
-Scheme:
- type: programming
- color: "#1e4aec"
- extensions:
- - ".scm"
- - ".sch"
- - ".sld"
- - ".sls"
- - ".sps"
- - ".ss"
- interpreters:
- - guile
- - bigloo
- - chicken
- - csi
- - gosh
- - r6rs
- ace_mode: scheme
- codemirror_mode: scheme
- codemirror_mime_type: text/x-scheme
- language_id: 343
-Scilab:
- type: programming
- extensions:
- - ".sci"
- - ".sce"
- - ".tst"
- ace_mode: text
- language_id: 344
-Self:
- type: programming
- color: "#0579aa"
- extensions:
- - ".self"
- tm_scope: none
- ace_mode: text
- language_id: 345
-ShaderLab:
- type: programming
- extensions:
- - ".shader"
- ace_mode: text
- tm_scope: source.shaderlab
- language_id: 664257356
-Shell:
- type: programming
- color: "#89e051"
- aliases:
- - sh
- - shell-script
- - bash
- - zsh
- extensions:
- - ".sh"
- - ".bash"
- - ".bats"
- - ".cgi"
- - ".command"
- - ".fcgi"
- - ".ksh"
- - ".sh.in"
- - ".tmux"
- - ".tool"
- - ".zsh"
- filenames:
- - ".bash_history"
- - ".bash_logout"
- - ".bash_profile"
- - ".bashrc"
- - PKGBUILD
- - gradlew
- interpreters:
- - ash
- - bash
- - dash
- - ksh
- - mksh
- - pdksh
- - rc
- - sh
- - zsh
- ace_mode: sh
- codemirror_mode: shell
- codemirror_mime_type: text/x-sh
- language_id: 346
-ShellSession:
- type: programming
- extensions:
- - ".sh-session"
- aliases:
- - bash session
- - console
- tm_scope: text.shell-session
- ace_mode: sh
- codemirror_mode: shell
- codemirror_mime_type: text/x-sh
- language_id: 347
-Shen:
- type: programming
- color: "#120F14"
- extensions:
- - ".shen"
- tm_scope: source.shen
- ace_mode: text
- language_id: 348
-Slash:
- type: programming
- color: "#007eff"
- extensions:
- - ".sl"
- tm_scope: text.html.slash
- ace_mode: text
- language_id: 349
-Slim:
- group: HTML
- type: markup
- extensions:
- - ".slim"
- tm_scope: text.slim
- ace_mode: text
- codemirror_mode: slim
- codemirror_mime_type: text/x-slim
- language_id: 350
-Smali:
- type: programming
- extensions:
- - ".smali"
- ace_mode: text
- tm_scope: source.smali
- language_id: 351
-Smalltalk:
- type: programming
- color: "#596706"
- extensions:
- - ".st"
- - ".cs"
- aliases:
- - squeak
- ace_mode: text
- codemirror_mode: smalltalk
- codemirror_mime_type: text/x-stsrc
- language_id: 352
-Smarty:
- type: programming
- extensions:
- - ".tpl"
- ace_mode: smarty
- codemirror_mode: smarty
- codemirror_mime_type: text/x-smarty
- tm_scope: text.html.smarty
- language_id: 353
-Solidity:
- type: programming
- color: "#AA6746"
- ace_mode: text
- tm_scope: source.solidity
- language_id: 237469032
-SourcePawn:
- type: programming
- color: "#5c7611"
- aliases:
- - sourcemod
- extensions:
- - ".sp"
- - ".inc"
- - ".sma"
- tm_scope: source.sp
- ace_mode: text
- language_id: 354
-Spline Font Database:
- type: data
- extensions:
- - ".sfd"
- tm_scope: text.sfd
- ace_mode: yaml
- language_id: 767169629
-Squirrel:
- type: programming
- color: "#800000"
- extensions:
- - ".nut"
- tm_scope: source.c++
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-c++src
- language_id: 355
-Stan:
- type: programming
- color: "#b2011d"
- extensions:
- - ".stan"
- ace_mode: text
- tm_scope: source.stan
- language_id: 356
-Standard ML:
- type: programming
- color: "#dc566d"
- aliases:
- - sml
- extensions:
- - ".ML"
- - ".fun"
- - ".sig"
- - ".sml"
- tm_scope: source.ml
- ace_mode: text
- codemirror_mode: mllike
- codemirror_mime_type: text/x-ocaml
- language_id: 357
-Stata:
- type: programming
- extensions:
- - ".do"
- - ".ado"
- - ".doh"
- - ".ihlp"
- - ".mata"
- - ".matah"
- - ".sthlp"
- ace_mode: text
- language_id: 358
-Stylus:
- type: markup
- group: CSS
- extensions:
- - ".styl"
- tm_scope: source.stylus
- ace_mode: stylus
- language_id: 359
-SubRip Text:
- type: data
- extensions:
- - ".srt"
- ace_mode: text
- tm_scope: text.srt
- language_id: 360
-Sublime Text Config:
- type: data
- group: JSON
- tm_scope: source.js
- ace_mode: javascript
- codemirror_mode: javascript
- codemirror_mime_type: text/javascript
- extensions:
- - ".sublime-build"
- - ".sublime-commands"
- - ".sublime-completions"
- - ".sublime-keymap"
- - ".sublime-macro"
- - ".sublime-menu"
- - ".sublime-mousemap"
- - ".sublime-project"
- - ".sublime-settings"
- - ".sublime-theme"
- - ".sublime-workspace"
- - ".sublime_metrics"
- - ".sublime_session"
- language_id: 423
-SugarSS:
- type: markup
- tm_scope: source.css.postcss.sugarss
- group: CSS
- extensions:
- - ".sss"
- ace_mode: text
- language_id: 826404698
-SuperCollider:
- type: programming
- color: "#46390b"
- extensions:
- - ".sc"
- - ".scd"
- interpreters:
- - sclang
- - scsynth
- tm_scope: source.supercollider
- ace_mode: text
- language_id: 361
-Swift:
- type: programming
- color: "#ffac45"
- extensions:
- - ".swift"
- ace_mode: text
- codemirror_mode: swift
- codemirror_mime_type: text/x-swift
- language_id: 362
-SystemVerilog:
- type: programming
- color: "#DAE1C2"
- extensions:
- - ".sv"
- - ".svh"
- - ".vh"
- ace_mode: verilog
- codemirror_mode: verilog
- codemirror_mime_type: text/x-systemverilog
- language_id: 363
-TI Program:
- type: programming
- ace_mode: text
- color: "#A0AA87"
- extensions:
- - ".8xp"
- - ".8xk"
- - ".8xk.txt"
- - ".8xp.txt"
- language_id: 422
- tm_scope: none
-TLA:
- type: programming
- extensions:
- - ".tla"
- tm_scope: source.tla
- ace_mode: text
- language_id: 364
-TOML:
- type: data
- extensions:
- - ".toml"
- tm_scope: source.toml
- ace_mode: toml
- codemirror_mode: toml
- codemirror_mime_type: text/x-toml
- language_id: 365
-TXL:
- type: programming
- extensions:
- - ".txl"
- tm_scope: source.txl
- ace_mode: text
- language_id: 366
-Tcl:
- type: programming
- color: "#e4cc98"
- extensions:
- - ".tcl"
- - ".adp"
- - ".tm"
- interpreters:
- - tclsh
- - wish
- ace_mode: tcl
- codemirror_mode: tcl
- codemirror_mime_type: text/x-tcl
- language_id: 367
-Tcsh:
- type: programming
- group: Shell
- extensions:
- - ".tcsh"
- - ".csh"
- tm_scope: source.shell
- ace_mode: sh
- codemirror_mode: shell
- codemirror_mime_type: text/x-sh
- language_id: 368
-TeX:
- type: markup
- color: "#3D6117"
- ace_mode: tex
- codemirror_mode: stex
- codemirror_mime_type: text/x-stex
- wrap: true
- aliases:
- - latex
- extensions:
- - ".tex"
- - ".aux"
- - ".bbx"
- - ".bib"
- - ".cbx"
- - ".cls"
- - ".dtx"
- - ".ins"
- - ".lbx"
- - ".ltx"
- - ".mkii"
- - ".mkiv"
- - ".mkvi"
- - ".sty"
- - ".toc"
- language_id: 369
-Tea:
- type: markup
- extensions:
- - ".tea"
- tm_scope: source.tea
- ace_mode: text
- language_id: 370
-Terra:
- type: programming
- extensions:
- - ".t"
- color: "#00004c"
- ace_mode: lua
- codemirror_mode: lua
- codemirror_mime_type: text/x-lua
- interpreters:
- - lua
- language_id: 371
-Text:
- type: prose
- wrap: true
- aliases:
- - fundamental
- extensions:
- - ".txt"
- - ".fr"
- - ".nb"
- - ".ncl"
- - ".no"
- filenames:
- - COPYING
- - COPYRIGHT.regex
- - FONTLOG
- - INSTALL
- - INSTALL.mysql
- - LICENSE
- - LICENSE.mysql
- - NEWS
- - README.1ST
- - README.me
- - README.mysql
- - click.me
- - delete.me
- - keep.me
- - read.me
- - test.me
- tm_scope: none
- ace_mode: text
- language_id: 372
-Textile:
- type: prose
- ace_mode: textile
- codemirror_mode: textile
- codemirror_mime_type: text/x-textile
- wrap: true
- extensions:
- - ".textile"
- tm_scope: none
- language_id: 373
-Thrift:
- type: programming
- tm_scope: source.thrift
- extensions:
- - ".thrift"
- ace_mode: text
- language_id: 374
-Turing:
- type: programming
- color: "#cf142b"
- extensions:
- - ".t"
- - ".tu"
- tm_scope: source.turing
- ace_mode: text
- language_id: 375
-Turtle:
- type: data
- extensions:
- - ".ttl"
- tm_scope: source.turtle
- ace_mode: text
- codemirror_mode: turtle
- codemirror_mime_type: text/turtle
- language_id: 376
-Twig:
- type: markup
- group: HTML
- extensions:
- - ".twig"
- tm_scope: text.html.twig
- ace_mode: twig
- codemirror_mode: twig
- codemirror_mime_type: text/x-twig
- language_id: 377
-Type Language:
- type: data
- aliases:
- - tl
- extensions:
- - ".tl"
- tm_scope: source.tl
- ace_mode: text
- language_id: 632765617
-TypeScript:
- type: programming
- color: "#2b7489"
- aliases:
- - ts
- extensions:
- - ".ts"
- - ".tsx"
- tm_scope: source.ts
- ace_mode: typescript
- codemirror_mode: javascript
- codemirror_mime_type: application/typescript
- language_id: 378
-Unified Parallel C:
- type: programming
- group: C
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csrc
- extensions:
- - ".upc"
- tm_scope: source.c
- language_id: 379
-Unity3D Asset:
- type: data
- ace_mode: yaml
- codemirror_mode: yaml
- codemirror_mime_type: text/x-yaml
- extensions:
- - ".anim"
- - ".asset"
- - ".mat"
- - ".meta"
- - ".prefab"
- - ".unity"
- tm_scope: source.yaml
- language_id: 380
-Unix Assembly:
- type: programming
- group: Assembly
- extensions:
- - ".s"
- - ".ms"
- tm_scope: source.assembly
- ace_mode: assembly_x86
- language_id: 120
-Uno:
- type: programming
- extensions:
- - ".uno"
- ace_mode: csharp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csharp
- tm_scope: source.cs
- language_id: 381
-UnrealScript:
- type: programming
- color: "#a54c4d"
- extensions:
- - ".uc"
- tm_scope: source.java
- ace_mode: java
- codemirror_mode: clike
- codemirror_mime_type: text/x-java
- language_id: 382
-UrWeb:
- type: programming
- aliases:
- - Ur/Web
- - Ur
- extensions:
- - ".ur"
- - ".urs"
- tm_scope: source.ur
- ace_mode: text
- language_id: 383
-VCL:
- type: programming
- color: "#0298c3"
- extensions:
- - ".vcl"
- tm_scope: source.varnish.vcl
- ace_mode: text
- language_id: 384
-VHDL:
- type: programming
- color: "#adb2cb"
- extensions:
- - ".vhdl"
- - ".vhd"
- - ".vhf"
- - ".vhi"
- - ".vho"
- - ".vhs"
- - ".vht"
- - ".vhw"
- ace_mode: vhdl
- codemirror_mode: vhdl
- codemirror_mime_type: text/x-vhdl
- language_id: 385
-Vala:
- type: programming
- color: "#fbe5cd"
- extensions:
- - ".vala"
- - ".vapi"
- ace_mode: vala
- language_id: 386
-Verilog:
- type: programming
- color: "#b2b7f8"
- extensions:
- - ".v"
- - ".veo"
- ace_mode: verilog
- codemirror_mode: verilog
- codemirror_mime_type: text/x-verilog
- language_id: 387
-Vim script:
- type: programming
- color: "#199f4b"
- tm_scope: source.viml
- aliases:
- - vim
- - viml
- - nvim
- extensions:
- - ".vim"
- filenames:
- - ".nvimrc"
- - ".vimrc"
- - _vimrc
- - gvimrc
- - nvimrc
- - vimrc
- ace_mode: text
- language_id: 388
-Visual Basic:
- type: programming
- color: "#945db7"
- extensions:
- - ".vb"
- - ".bas"
- - ".cls"
- - ".frm"
- - ".frx"
- - ".vba"
- - ".vbhtml"
- - ".vbs"
- tm_scope: source.vbnet
- aliases:
- - vb.net
- - vbnet
- ace_mode: text
- codemirror_mode: vb
- codemirror_mime_type: text/x-vb
- language_id: 389
-Volt:
- type: programming
- color: "#1F1F1F"
- extensions:
- - ".volt"
- tm_scope: source.d
- ace_mode: d
- codemirror_mode: d
- codemirror_mime_type: text/x-d
- language_id: 390
-Vue:
- type: markup
- color: "#2c3e50"
- extensions:
- - ".vue"
- tm_scope: text.html.vue
- ace_mode: html
- language_id: 391
-Wavefront Material:
- type: data
- extensions:
- - ".mtl"
- tm_scope: source.wavefront.mtl
- ace_mode: text
- language_id: 392
-Wavefront Object:
- type: data
- extensions:
- - ".obj"
- tm_scope: source.wavefront.obj
- ace_mode: text
- language_id: 393
-Web Ontology Language:
- type: data
- extensions:
- - ".owl"
- tm_scope: text.xml
- ace_mode: xml
- language_id: 394
-WebAssembly:
- type: programming
- color: "#04133b"
- extensions:
- - ".wast"
- - ".wat"
- aliases:
- - wast
- - wasm
- tm_scope: source.webassembly
- ace_mode: lisp
- codemirror_mode: commonlisp
- codemirror_mime_type: text/x-common-lisp
- language_id: 956556503
-WebIDL:
- type: programming
- extensions:
- - ".webidl"
- tm_scope: source.webidl
- ace_mode: text
- codemirror_mode: webidl
- codemirror_mime_type: text/x-webidl
- language_id: 395
-World of Warcraft Addon Data:
- type: data
- extensions:
- - ".toc"
- tm_scope: source.toc
- ace_mode: text
- language_id: 396
-X10:
- type: programming
- aliases:
- - xten
- ace_mode: text
- extensions:
- - ".x10"
- color: "#4B6BEF"
- tm_scope: source.x10
- language_id: 397
-XC:
- type: programming
- color: "#99DA07"
- extensions:
- - ".xc"
- tm_scope: source.xc
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csrc
- language_id: 398
-XCompose:
- type: data
- filenames:
- - ".XCompose"
- - XCompose
- - xcompose
- tm_scope: config.xcompose
- ace_mode: text
- language_id: 225167241
-XML:
- type: data
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- aliases:
- - rss
- - xsd
- - wsdl
- extensions:
- - ".xml"
- - ".adml"
- - ".admx"
- - ".ant"
- - ".axml"
- - ".builds"
- - ".ccproj"
- - ".ccxml"
- - ".clixml"
- - ".cproject"
- - ".cscfg"
- - ".csdef"
- - ".csl"
- - ".csproj"
- - ".ct"
- - ".depproj"
- - ".dita"
- - ".ditamap"
- - ".ditaval"
- - ".dll.config"
- - ".dotsettings"
- - ".filters"
- - ".fsproj"
- - ".fxml"
- - ".glade"
- - ".gml"
- - ".grxml"
- - ".iml"
- - ".ivy"
- - ".jelly"
- - ".jsproj"
- - ".kml"
- - ".launch"
- - ".mdpolicy"
- - ".mjml"
- - ".mm"
- - ".mod"
- - ".mxml"
- - ".natvis"
- - ".ndproj"
- - ".nproj"
- - ".nuspec"
- - ".odd"
- - ".osm"
- - ".pkgproj"
- - ".plist"
- - ".pluginspec"
- - ".proj"
- - ".props"
- - ".ps1xml"
- - ".psc1"
- - ".pt"
- - ".rdf"
- - ".resx"
- - ".rss"
- - ".sch"
- - ".scxml"
- - ".sfproj"
- - ".shproj"
- - ".srdf"
- - ".storyboard"
- - ".stTheme"
- - ".sublime-snippet"
- - ".targets"
- - ".tmCommand"
- - ".tml"
- - ".tmLanguage"
- - ".tmPreferences"
- - ".tmSnippet"
- - ".tmTheme"
- - ".ts"
- - ".tsx"
- - ".ui"
- - ".urdf"
- - ".ux"
- - ".vbproj"
- - ".vcxproj"
- - ".vsixmanifest"
- - ".vssettings"
- - ".vstemplate"
- - ".vxml"
- - ".wixproj"
- - ".wsdl"
- - ".wsf"
- - ".wxi"
- - ".wxl"
- - ".wxs"
- - ".x3d"
- - ".xacro"
- - ".xaml"
- - ".xib"
- - ".xlf"
- - ".xliff"
- - ".xmi"
- - ".xml.dist"
- - ".xproj"
- - ".xsd"
- - ".xspec"
- - ".xul"
- - ".zcml"
- filenames:
- - ".classpath"
- - ".project"
- - App.config
- - NuGet.config
- - Settings.StyleCop
- - Web.Debug.config
- - Web.Release.config
- - Web.config
- - packages.config
- language_id: 399
-XPM:
- type: data
- extensions:
- - ".xpm"
- - ".pm"
- ace_mode: c_cpp
- tm_scope: source.c
- language_id: 781846279
-XPages:
- type: data
- extensions:
- - ".xsp-config"
- - ".xsp.metadata"
- tm_scope: text.xml
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- language_id: 400
-XProc:
- type: programming
- extensions:
- - ".xpl"
- - ".xproc"
- tm_scope: text.xml
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- language_id: 401
-XQuery:
- type: programming
- color: "#5232e7"
- extensions:
- - ".xquery"
- - ".xq"
- - ".xql"
- - ".xqm"
- - ".xqy"
- ace_mode: xquery
- codemirror_mode: xquery
- codemirror_mime_type: application/xquery
- tm_scope: source.xq
- language_id: 402
-XS:
- type: programming
- extensions:
- - ".xs"
- tm_scope: source.c
- ace_mode: c_cpp
- codemirror_mode: clike
- codemirror_mime_type: text/x-csrc
- language_id: 403
-XSLT:
- type: programming
- aliases:
- - xsl
- extensions:
- - ".xslt"
- - ".xsl"
- tm_scope: text.xml.xsl
- ace_mode: xml
- codemirror_mode: xml
- codemirror_mime_type: text/xml
- color: "#EB8CEB"
- language_id: 404
-Xojo:
- type: programming
- extensions:
- - ".xojo_code"
- - ".xojo_menu"
- - ".xojo_report"
- - ".xojo_script"
- - ".xojo_toolbar"
- - ".xojo_window"
- tm_scope: source.vbnet
- ace_mode: text
- language_id: 405
-Xtend:
- type: programming
- extensions:
- - ".xtend"
- ace_mode: text
- language_id: 406
-YAML:
- type: data
- tm_scope: source.yaml
- aliases:
- - yml
- extensions:
- - ".yml"
- - ".reek"
- - ".rviz"
- - ".sublime-syntax"
- - ".syntax"
- - ".yaml"
- - ".yaml-tmlanguage"
- - ".yml.mysql"
- filenames:
- - ".clang-format"
- - ".clang-tidy"
- ace_mode: yaml
- codemirror_mode: yaml
- codemirror_mime_type: text/x-yaml
- language_id: 407
-YANG:
- type: data
- extensions:
- - ".yang"
- tm_scope: source.yang
- ace_mode: text
- language_id: 408
-YARA:
- type: data
- ace_mode: text
- extensions:
- - ".yar"
- - ".yara"
- tm_scope: source.yara
- language_id: 805122868
-Yacc:
- type: programming
- extensions:
- - ".y"
- - ".yacc"
- - ".yy"
- tm_scope: source.bison
- ace_mode: text
- color: "#4B6C4B"
- language_id: 409
-Zephir:
- type: programming
- color: "#118f9e"
- extensions:
- - ".zep"
- tm_scope: source.php.zephir
- ace_mode: php
- language_id: 410
-Zimpl:
- type: programming
- extensions:
- - ".zimpl"
- - ".zmpl"
- - ".zpl"
- tm_scope: none
- ace_mode: text
- language_id: 411
-desktop:
- type: data
- extensions:
- - ".desktop"
- - ".desktop.in"
- tm_scope: source.desktop
- ace_mode: text
- language_id: 412
-eC:
- type: programming
- color: "#913960"
- extensions:
- - ".ec"
- - ".eh"
- tm_scope: source.c.ec
- ace_mode: text
- language_id: 413
-edn:
- type: data
- ace_mode: clojure
- codemirror_mode: clojure
- codemirror_mime_type: text/x-clojure
- extensions:
- - ".edn"
- tm_scope: source.clojure
- language_id: 414
-fish:
- type: programming
- group: Shell
- interpreters:
- - fish
- extensions:
- - ".fish"
- tm_scope: source.fish
- ace_mode: text
- language_id: 415
-mupad:
- type: programming
- extensions:
- - ".mu"
- ace_mode: text
- language_id: 416
-nesC:
- type: programming
- color: "#94B0C7"
- extensions:
- - ".nc"
- ace_mode: text
- tm_scope: source.nesc
- language_id: 417
-ooc:
- type: programming
- color: "#b0b77e"
- extensions:
- - ".ooc"
- ace_mode: text
- language_id: 418
-reStructuredText:
- type: prose
- wrap: true
- aliases:
- - rst
- extensions:
- - ".rst"
- - ".rest"
- - ".rest.txt"
- - ".rst.txt"
- ace_mode: text
- codemirror_mode: rst
- codemirror_mime_type: text/x-rst
- language_id: 419
-wdl:
- type: programming
- color: "#42f1f4"
- extensions:
- - ".wdl"
- tm_scope: source.wdl
- ace_mode: text
- language_id: 374521672
-wisp:
- type: programming
- ace_mode: clojure
- codemirror_mode: clojure
- codemirror_mime_type: text/x-clojure
- color: "#7582D1"
- extensions:
- - ".wisp"
- tm_scope: source.clojure
- language_id: 420
-xBase:
- type: programming
- color: "#403a40"
- aliases:
- - advpl
- - clipper
- - foxpro
- extensions:
- - ".prg"
- - ".ch"
- - ".prw"
- tm_scope: source.harbour
- ace_mode: text
- language_id: 421
diff --git a/swh/langdetect/static_data/model.h5 b/swh/langdetect/static_data/model.h5
index 513da70..948df95 100644
Binary files a/swh/langdetect/static_data/model.h5 and b/swh/langdetect/static_data/model.h5 differ