Thursday, July 11, 2019

Inspec train plugin with Openshift

If you are in the IT industry, I hope you have heard about the Chef automation framework, which is commonly used for deployment automation. In this blog post, I'm going to talk about the InSpec train plugin that I have written for the OpenShift platform. InSpec, released by the Chef developers, is an automated testing tool for integration, compliance, security, and other policy requirements. But (at the time of writing this post) there was no InSpec train transport that could communicate with the OpenShift platform, because we can't use SSH to connect to OpenShift nodes; the OC client has to be used to connect to OpenShift pods. So I thought of writing a train plugin for OpenShift.
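For readers who have not used InSpec before, here is a minimal sketch of what a test file looks like (the resources checked below are illustrative examples, not taken from the plugin repository):

describe file('/etc/passwd') do
  it { should exist }
  its('mode') { should cmp '0644' }
end

describe command('java -version') do
  its('exit_status') { should eq 0 }
end

Each describe block is a check that the train transport runs against the target, which in our case will be an OpenShift pod.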

OpenShift train plugin

The custom OpenShift train plugin makes it possible to execute InSpec test cases inside OpenShift pods. The source code of the plugin can be found at [1]. To build it, RubyGems must be installed on your operating system. After installing the train plugin, the OpenShift client distribution should be set up on the file system; it can be downloaded from [2].

Configuring openshift-origin-client-tools 

TOKEN=VtEZif0g4N6SPB56__rxcw5jEMMMB0eYI5yZMHFbqI (a valid token)

./oc login https://console.org-env-0.org.innovateuk.ukri.org --token=$TOKEN --insecure-skip-tls-verify=true

After the above command runs successfully, a configuration file will be created at ~/.kube/config in the machine's home directory.
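For reference, the generated file follows the standard kubeconfig layout; a trimmed sketch (cluster name, server address, and token below are placeholders) looks roughly like this:

apiVersion: v1
kind: Config
clusters:
- cluster:
    server: https://console.example.org:443
  name: example-cluster
users:
- name: example-user
  user:
    token: <login-token>
contexts:
- context:
    cluster: example-cluster
    user: example-user
  name: example-context
current-context: example-context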

✋Important
If the above steps are not completed, the following error will be encountered in test runs:

[2019-05-13T19:43:15.223504 #6828] DEBUG -- : [Openshift] Parameter erroutput error: Missing or incomplete configuration info. Please login or point to an existing, complete config file: 1. Via the command-line flag --config 2. Via the KUBECONFIG environment variable 3. In your home directory as ~/.kube/config To view or setup config directly use the 'config' command.

OpenShift properties YAML file

The properties file path can be defined with the environment variable OPENSHIFT_CRED_FILE. If it is not defined, the train plugin will by default search for openshift-properties.yml in the folder where the test command is executed.

ocPath: /home/madhawa/test/openshift-origin/openshift-origin-client-tools-v3.6.0-alpha.2-3c221d5-linux-64bit
serverUrl: https://console.xxxxxxxxxxx:443
token: DfJ_V1eSxz8gtK8rRGWBiqKUczvuuke_-o8vSlDtPhs
project: project-test


ocPath : path to the OpenShift client distribution
serverUrl : OpenShift login URL
token : OpenShift login token
project : OpenShift project name
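For example, keeping the properties file outside the execution folder could look like this (the file path below is just an illustrative placeholder):

export OPENSHIFT_CRED_FILE=/path/to/openshift-properties.yml
inspec exec test_xpath.rb -t openshift://<pod-name>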


Instructions to install the plugin:
Prerequisite: InSpec and Ruby should be installed on the OS.
1. Build the plugin: gem build train-openshift.gemspec
2. Install the plugin: inspec plugin install train-openshift-0.0.1.gem
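To confirm the installation succeeded, list the installed plugins; the entry should match the gem name built above:

inspec plugin list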


Obtaining the pod name (where the tests need to run)

#Show the current project, then switch to the test project
./oc project
./oc project project-test

#List pod names 
./oc get pods

After running the above commands, the tests can be executed against a pod in the OpenShift project defined in openshift-properties.yml by providing the pod name in the test run command, as shown below.

inspec exec #{testfile} -t openshift://#{pod} 


Execute the test cases with the properties file 


1. Copy the openshift-properties.yml file into the folder where the tests are executed.
2. Run the InSpec tests with the following command:

inspec exec test_xpath.rb -t openshift://project-esb-deployment-2-mz5jz


 Execute the test cases with environment variables 


1. Execute the inspec exec command with the required variables set, for example:

PROJECT="project-dev" \
POD="project-esb-deployment-2-mz5jz" \
OC_PATH="/home/madhawa/xiges/openshift-origin/openshift-origin-client-tools-v3.6.0-alpha.2-3c221d5-linux-64bit" \
SERVER="https://console.org-env-0.org.innovateuk.ukri.org:443" \
TOKEN="D4BYR3zg9GqK16hUGfwQq7NzYmlfhyPv0vswObSEtjU" \
inspec exec test_xpath.rb -t openshift://project-esb-deployment-2-mz5jz --attrs ../profile-attribute.yml

Use -l debug to enable debug logs.
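For example, the previous run can be repeated with debug logging enabled:

inspec exec test_xpath.rb -t openshift://project-esb-deployment-2-mz5jz -l debug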

Happy testing with InSpec and OpenShift!

[1] https://github.com/madhawa-gunasekara/train-openshift
[2] https://github.com/openshift/origin/releases

Thursday, November 20, 2014

NLP Categorizer

Most of the time, people need to categorize documents according to their context, which is very useful when working with a very large number of documents.
Fortunately, it is quite easy to build an NLP categorizer for this purpose. There are several algorithms used to categorize documents; most of them are supervised algorithms, such as maximum entropy, naive Bayes, and maximum entropy Markov models. Today I'm going to describe how to categorize documents using the Apache OpenNLP toolkit, which supports the maximum entropy algorithm.
First we have to create a training dataset. Each line of training data should contain a category followed by the content. Normally there should be more than 5,000 training samples to get a good model.

Other good morning /
Other good evening /
Other have you any update on negombo road till wattala /
Other perhaps the madness was always there but only the schools bring it out? /
Other sorry didn't notice geotag /
Feed high traffic in wattala /
Feed low traffic in negombo road /
Feed moving traffic in wattala /
Feed nawala bridge area clear /
Feed no traffic at all at ja-ela /

Then we need to train an NLP categorizer model on the dataset; the OpenNLP documentation describes the training API, so you can easily train your own model.
The following code can be used to train the categorizer model and to test it. Here I have used a cutoff of 2 and 300 iterations as the training parameters.

import opennlp.tools.doccat.DoccatModel;
import opennlp.tools.doccat.DocumentCategorizerME;
import opennlp.tools.doccat.DocumentSample;
import opennlp.tools.doccat.DocumentSampleStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import org.apache.log4j.Logger;

import java.io.*;

public class FeedClassifierTrainer {

 private static final Logger log = Logger.getLogger(FeedClassifierTrainer.class);

 private static DoccatModel model = null;

 public static void main(String[] args) {
  log.debug("Model training started");
  //to train the model
  new FeedClassifierTrainer().train();
  //testing purpose
  String content = "due to train strike heavy traffic in maradhana ";
  try {
   //test the model
   new FeedClassifierTrainer().test(content);
  } catch (IOException e) {
   e.printStackTrace();
  }
 }

 /**
  * Training the models
  */
 public void train() {
  // model file name (you can choose your own name for the model)
  String onlpModelPath = "en-doccat.bin";
  // training data set
  String trainingDataFilePath = "data.txt";

  InputStream dataInputStream = null;
  try {
   // Read training data file
   dataInputStream = new FileInputStream(trainingDataFilePath);
   // Read each training instance line by line
   ObjectStream<String> lineStream = new PlainTextByLineStream(dataInputStream, "UTF-8");
   // Build a sample stream for training
   ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);
   // Train the model: "en" is the language, sampleStream holds the training data, cutoff 2, 300 iterations
   model = DocumentCategorizerME.train("en", sampleStream, 2, 300);
  } catch (IOException e) {
   log.error(e.getMessage());
  } finally {
   if (dataInputStream != null) {
    try {
     dataInputStream.close();
    } catch (IOException e) {
     log.error(e.getMessage());
    }
   }
  }


  // Now we write the trained model to a file in order to use the
  // trained classifier in production
  try {
   if (model != null) {
    // saving the model to a file
    model.serialize(new FileOutputStream(onlpModelPath));
   }
  } catch (IOException e) {
   log.error(e.getMessage());
  }
 }

 /*
  * Now we call the saved model and test it
  * Give it a new text document and the expected category
  */
 public void test(String text) throws IOException {
  String classificationModelFilePath = "en-doccat.bin";
  DocumentCategorizerME classificationME =
    new DocumentCategorizerME(
      new DoccatModel(
        new FileInputStream(
          classificationModelFilePath)));
  String documentContent = text;
  double[] classDistribution = classificationME.categorize(documentContent);
  // get the predicted model
  String predictedCategory = classificationME.getBestCategory(classDistribution);
  System.out.println("Model prediction : " + predictedCategory);

 }
}
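When the trainer and the test run against the sample dataset above, the console output should end with something like the following (the exact category depends on what the model has learned):

Model prediction : Feed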

Saturday, October 25, 2014

Solving java.lang.NoClassDefFoundError in Maven

Most developers encounter this error when they run executables from the terminal. These errors commonly do not occur inside Integrated Development Environments (IDEs). NoClassDefFoundError in Java occurs when the Java Virtual Machine is unable to find at runtime a particular class that was available at compile time.
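A minimal sketch of how this typically plays out with a Maven build (the artifact and class names below are hypothetical): the project compiles because Maven resolves dependencies at build time, but the plain jar produced by mvn package does not include them, so the class is missing at runtime until the dependency jars are added to the classpath.

# packaging succeeds, but dependencies are not inside the jar
mvn package
java -cp target/myapp-1.0.jar com.example.Main
# -> java.lang.NoClassDefFoundError for a class from a dependency

# copy the dependency jars (maven-dependency-plugin) and put them on the classpath
mvn dependency:copy-dependencies
java -cp "target/myapp-1.0.jar:target/dependency/*" com.example.Main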

Thursday, October 16, 2014

Converting HTK binary MFCC values into ASCII MFCC

HTK is an open-source toolkit for hidden Markov models developed by Cambridge University. People mostly use this toolkit for speech recognition and speech synthesis purposes.
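As a starting point, HTK ships an HList tool that prints feature files in readable form; a hedged sketch of the usual conversion (the -r flag prints raw values, but it is worth double-checking against the HList page of the HTK Book) is:

# print the binary MFCC file as plain ASCII values
HList -r input.mfc > output.ascii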