Sef4j Github Project Information

https://github.com/Arnaud-Nauwynck/sef4j

this page: <a href=”http://arnaud-nauwynck.github.io/2015/04/10/sef4j.html>http://arnaud-nauwynck.github.io/2015/04/10/sef4j.html</a>

other decriptions (slides to come soon)

Get Source

$ git clone --depth 1 https://github.com/Arnaud-Nauwynck/sef4j.git
$ cd sef4j
$ mvn clean install -DskipTests

Use it in your project… using maven pom.xml:

   <properties>
     <sef4j.version>1.0.0</sef4j.version>
   </properties>
   ...
   <dependency>
     <groupId>org.sef4j</groupId>
     <artifactId>sef4j-callstack</artifactId>
     <version>${sef4j.version}</version>
   </dependency>

See also : source code for documentation.
Some important classes contains ascii-art doc comments. Examples: CallStack.java, PropTreeNode.java, FragmentsProvidersExporter.java

As of 2015-04, there is only me as developper… All contributors are welcome!
This project is inspired from other past pojects developped and used in production, but this project itself is not currently used in production… only a ‘clone’ of it.

Sef4j - Name

Sef4j stands as initials for “Simple Event Facade for Java”.

This is not a typo, its name really looks like “Slf4j : Simple Logging Facade for Java”

The motto of Sef4j
Sef4j is to Event Monitoring what Slf4j is to Log : Simple and Efficient

It tries to offer a simple API for Events and Monitoring, as simple as slf4j (previously Log4J) is to Log.
Maybe “Semf4j” would be a more appropriate name (M for monitoring)?

Another project name could have been “Performance Monitoring Facade 4 Java” : perf4j, but the name is already registered. see https://github.com/perf4j/perf4j Unfortunatly, the project seems to do very few things compared to mine (maybe a pull request for them?).

Sef4j, is a “yet-another” performance monitoring framework, which has now many free/commercial equivalent. Open source “equivalent framework” would be Dropwizard Metrics (formely Codahale Metrics), JavaMelody, or many other… Commercial equivalent are ultra impressives (and ultra expensive…).

Sef4j - Goals : How to consider it

Sef4j aims to be not only a “Performance” profiler framework, but also an “Event” framework, for plugging Event Handlers as “Appenders” onto simple “Event Logger” from code, so as to produce logging, monitoring, alerts … to ElasticSearch, InfluxDB, Zooker …

Before Sef4j you add a lot of specialized monitoring frameworks for J2EE apis, for Jdbc, Servlets, … but nothing for instrumenting your own code.

Don’t consider Sef4j as a limited framework for events only, of for performance mertics only, or of j2ee profiling only… Sef4j is for all you can do to enrich performance and monitoring of events in your code.

When you want logs in the jvm (not only java), you don't call "System.out.println()" (shame on you if you still do).
You just add slf4j + logback to your CLASSPATH, put a logback.groovy (xml) file...
And you know it will "just work fine" without worrying about it.
If you wan't more appenders, you will have more appenders extensions.

Now transpose this sentence to Sef4j:

When you want performance and monitoring in the jvm (not only java), you don't call "System.currentTimeMillis()" (shame on you if you still do).
You just add sef4j to your CLASSPATH, put a sef4j.groovy (xml) file...
And you know it will "just work fine" without worrying about it.
If you wan't more performance metrics and monitoring, you will have more extensions.

Sef4j - Non Goals

Sef4j is not an even-driven framework for chaining reactive events as an asynchronous programmation model. See Akka instead, or any other projects</br> Sef4j is not an EventSource framework for doing CQRS programmintg style.</BR> Sef4j is not a backend store for events, or a middleware. See Apache Kafka, Apache ActiveMQ, ZeroMQ, or any other *-MQ framework for that.</BR>

However, Sef4j will support Appenders for pushing event to these backend systems.

Difference with DropWizard Metrics : Tree of Statistics Counters

The big differences with Sef4j compared to Dropwizard Metrics is that Sef4j captures applicative stack traces, not flat traces counters like Metrics. But it also allows you to manually instrument applicative parameters that make sense in your business code, and use them to categorize differents types of counters for the same stack trace. For example in a financial application, you may want to have several counters for the same pricing method depending of the portfolio used, instruments, thirdparties… because different rules are used to dispatch to different algorithms and data repositories.

Counters may be in-memory tree, that aggregate events with predefined or user-defined listener code.
The framework for instrumenting follows open-close principles! … If offers you events for push() and pop() for every instrumented methods, you are free to treat these events for aggregating what make sense, and producing counter, histograms, alerts, events, logs…

Here is a typical example showing a tree of statistic counters, compared to flat statistic list as in Dropwizard Metrics.
Say you have 2 entry points Web services, internally calling several services, and then DAOs jdbc queries.

In Dropwizard Metrics, you would instrument it like that:

public class MyWebService {
   @Timed
   public MyResult myService1(param1, param2...) {
      dao.myQuery1(..)
   }
   @Timed
   public MyResult myService2(param1, param2...) {
      dao.myQuery1(..)
      dao.myQuery2(..)      
   }
}

public class MyDAO {
   @Timed
   public MyResult myQuery1(param1, param2...) {
      jdbc / jpa query...
   }
   @Timed
   public MyResult myQuery2(param1, param2...) {
      jdbc / jpa query...
   }
}

In DropWizard Metrics, You will get 4 flat metric counters:

MyWebService.myServices1  --> counter stats, ..
MyWebService.myServices2  --> counter stats, ..
MyDAO.myQuery1  --> counter stats, ..
MyDAO.myQuery2  --> counter stats, ..

How can you known how are linked performance problems in DAO.myQuery1 and MyWebService entrypoint 1/2 ?.. You can’t.

In Sef4j as in many J2EE profiler tools, you will get a tree of performance counters

public class MyWebService {
   @Sef4j
   public MyResult myService1(param1, param2...) {
      ...
   }
}
/
+-- MyWebService.myServices1  --> counter stats, ..
|     |
|     +-- MyDAO.myQuery1  --> counter stats, ..
|     +-- MyDAO.myQuery2  --> counter stats, ..
|
+-- MyWebService.myServices2  --> counter stats, ..
      |
      +-- MyDAO.myQuery1  --> counter stats, ..
      +-- MyDAO.myQuery2  --> counter stats, ..   

Internally, it calls push and pop events, and dispatch them to thread local handlers in a very efficient way (few nanos of overhead only, no temporary object allocation)… Therefore the name Sef4j.

For free, you can also have logs if you want them, and only on demand on method start/stop or when method is slower than expected:

public class MyWebService {
   @Sef4j(logPush=false, logPop=300) // <= log on pop only, when time > 300 millis
   public MyResult myService1(param1, param2...) {
      ...
   }
}
[yyyy/MM/dd HH:mm:ss.sss] MyWebService.myService1() ... took: 500 millis

Statistic Histogram and Pending Metrics Counters

By default, you have handlers that increment in-memory statistic counter tree :

/
+-- MyWebService.myServices1  --> Histogram stats + Pending Counters
|     |
|     +-- MyDAO.myQuery1  --> Histogram stats + Pending Counters
|     +-- MyDAO.myQuery2  --> Histogram stats + Pending Counters
|
+-- MyWebService.myServices2  --> Histogram stats + Pending Counters
      |
      +-- MyDAO.myQuery1  --> Histogram stats + Pending Counters
      +-- MyDAO.myQuery2  --> Histogram stats + Pending Counters

For each counter stats, you have an efficient lock-free data structure that contains:

  • current number of pending threads + sum of pending threads starttime
  • number of finished executions + sum of execution time, using log-based histogram per time

Histograms are hard-coded with 10 buckets, using this logarithmic range:

  • bucket[0]: 0 millis
  • bucket[1]: 1 - 31 millis
  • bucket[2]: 32 - 63 millis
  • bucket[3]: 64 - 127 millis
  • bucket[4]: 128 - 255 millis
  • bucket[5]: 256 - 511 millis
  • bucket[6]: 512 - 1023 millis
  • bucket[7]: 1024 - 2047 millis
  • bucket[8]: 2048 - 4095 millis
  • bucket[9]: more than 4096 millis

The finished executions times make sense to display as long-term graph: refresh every 30 minutes in Graphana + InfluxDB, aggregate per method using all buckets … It shows the average speed of your method. When zooming into details, you can have details per method, and find fast or exceptionnaly slow executions.

The pending thread information can be used to display the current running activity of you server. It particulary makes sense to display it as pseudo-realtime grapĥ: refresh every 30 seconds in Graphana + InfluxDB, aggregated for all entry-points…

You may also raise alarms base on threshold for suspect method not ending fast enough.

Internally, the pending data structure counter is also a very interresting and efficient (lock-free, no memory allocation) data structure!

Here is an explanation Suppose you have these chronology:

  • at time t1: Thread1 - enter => pendingCount=1, pendingSumStart=t1
  • at time t2: Thread2 - enter => pendingCount=2, pendingSumStart=t1+t2
  • at time t3: Thread3 - enter => pendingCount=3, pendingSumStart=t1+t2+t3
  • at time t4: Thread1 - leave => pendingCount=2, pendingSumStart=t2+t3
  • at time t5: Thread3 - leave => pendingCount=1, pendingSumStart=t2
  • ...

At any time (t), you have the number of threads running a method, and the sum of time already consumed by theses threads. Example: t between t4 and t5 => you have 2 threads running, then entered in average at time (t2+t3)/2, so then are running in average from “t - (t2+t3)/2” millis. The total time/Cpu already elapsed is “t2+t3 - 2*t”

Displaying Statistics Counters Tree Structure as Flat List with Group-By

If you still want to show a flat list of statistic counters from this rich tree data structure, you can: it is only a matter of grouping and summing per node name.

sum(**/MyDAO.myQuery1) =   
     MyWebService.myServices1/MyDAO.myQuery1
   + MyWebService.myServices2/MyDAO.myQuery1

A typical usage is to find the worst jdbc queries, independently of where they are called (cf jdbc-logger tool in previous post).

Instrumented Method - Applicative Stack Trace - Applicative Tree Path

What I call “Applicative Stack Trace” is the filtering of the real java stack trace when preserving only nodes of interrests : your annotated/instrumented methods, not all this java.lang., com.sun.reflect., org.springframework.* intermediates nodes you have…

at myapp.MyDAO
at myapp.MyWebService

instead of

at com.oracle.driver...
at org.sef4j ...
at org.hibernate...
at org.springframework...
at java.lang.reflect..
at 
at myapp.MyDAO
at 
at myapp.MyWebService
at ..
at org.springframework
at javax.servlet..
at tomcat..

With Sef4j, you can instrument method basically by putting a pre()/post() aspect within your method

public class MyWebService {

   @Sef4j // <== using whatever AOP framework (Aspectj, ...)
   public MyResult myService1(param1, param2...) {
      ...
   }
   
}

If you prefer, you can write the equivalent java code yourself, which is exactly 2 lines (+ 1 line for import) for jdk8, 5 lines for jdk < 8:

import org.sef4j.callstack.LocalCallStack;

public class MyWebService {

 /** instrumented code with sef4j, using jdk8 Closeable try-finally */
 public MyResult myService1(param1, param2...) {
   try (StackPopper toPop = LocalCallStack.meth("MyWebService", "myService1").push()) {
      ...
   }
 }
 /** instrumented code with sef4j, using explicit try-finally (jdk<8) */
 public MyResult myService1(param1, param2...) {
   StackPopper toPop = LocalCallStack.meth("MyWebService", "myService1").push();
   try {
      ...
   } finally {
      toPop.close();
   }
 }
}

you can enrich your method instrumentation : you have a fluent api to build information on the current method of the current thread, put info in it, then call push().

StackPopper toPop = LocalCallStack.method("MyWebService", "myService1")
        .withParam("param1", param1) // <= add method parameter info
        .withInheritedProp("param2", param2) // <= add inherited property (will be available as ThreadLocal for all child calls)
        .withLog(LOG, false, 300) // <= no log in push(), log in pop() only if time exceed 300 ms
        .withProgressExpectedCount(20) // <= expected 20 progress steps in a loop 
        .push();
        

Pushed method don’t have to be real method, virtual nodes can be introduced to split counters per applicative criteria

public class MyWebService {
   public MyResult myService1(param1, param2...) {
		try(AppCallStack.method("myService").WithParam("param1", param1).push())   // ==> attach inherited property to current Thread
			try (AppCallStack.method("perParam:" + param1)) {
			
				MyDAO.myQuery1()
				..
			} // => finally pop("perParam")
		}// => finally... pop("myService")
   }
}

You will get a tree of stat counter like

/
+-- MyWebService.myServices1  --> counter stats, ..
|     |
|     +-- perParam1:value1
|     |    +-- MyDAO.myQuery1  --> counter stats, ..
|     |    +-- MyDAO.myQuery2  --> counter stats, ..
|     +-- perParam1:value2
|     |    +-- MyDAO.myQuery1  --> counter stats, ..
|     |    +-- MyDAO.myQuery2  --> counter stats, ..
|     +-- perParam1:value3
|     |    +-- MyDAO.myQuery1  --> counter stats, ..
|     |    +-- MyDAO.myQuery2  --> counter stats, ..
|     ...