4.5. TensorBoard Usage

TensorBoard provides the visualization and tooling needed for machine learning experimentation. Full usage instructions are available in the official TensorBoard documentation.

4.5.1. Highlighted features

DeePMD-kit can now use most of the interesting features enabled by TensorBoard:

  • Tracking and visualizing metrics, such as l2_loss, l2_energy_loss and l2_force_loss

  • Visualizing the model graph (ops and layers)

  • Viewing histograms of weights, biases, or other tensors as they change over time

  • Viewing summaries of trainable variables

4.5.2. How to use TensorBoard with DeePMD-kit

Before running TensorBoard, make sure you have generated summary data in a log directory. To do so, modify the input script: setting “tensorboard” to true in the “training” subsection enables TensorBoard data analysis. For example, in water_se_a.json:

    "training" : {
	"systems":	["../data/"],
	"set_prefix":	"set",    
	"stop_batch":	1000000,
	"batch_size":	1,

	"seed":		1,

	"_comment": " display and restart",
	"_comment": " frequencies counted in batch",
	"disp_file":	"lcurve.out",
	"disp_freq":	100,
	"numb_test":	10,
	"save_freq":	1000,
	"save_ckpt":	"model.ckpt",

	"disp_training":true,
	"time_training":true,
	"tensorboard":	true,
	"tensorboard_log_dir":"log",
	"tensorboard_freq": 1000,
	"profiling":	false,
	"profiling_file":"timeline.json",
	"_comment":	"that's all"
    }
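The same switch can be set programmatically before launching a run. The following is a minimal sketch, assuming the input script is plain JSON with a top-level “training” section as in the excerpt above. The helper name enable_tensorboard is hypothetical and not part of DeePMD-kit, and the small dictionary stands in for a full input script.

```python
import json


def enable_tensorboard(config, log_dir="log", freq=1000):
    """Return a copy of `config` with TensorBoard logging enabled.

    Hypothetical helper, not part of DeePMD-kit. It only sets the three
    keys shown in the excerpt above and leaves everything else untouched.
    """
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    training = updated.setdefault("training", {})
    training["tensorboard"] = True
    training["tensorboard_log_dir"] = log_dir
    training["tensorboard_freq"] = freq
    return updated


if __name__ == "__main__":
    # Stand-in for a full input script; only a few keys are shown.
    config = {"training": {"stop_batch": 1000000, "disp_freq": 100}}
    print(json.dumps(enable_tensorboard(config)["training"], indent=4))
```

Returning a modified copy rather than editing in place keeps the original settings available for comparison between runs.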

Once you have event files, run TensorBoard and provide the log directory. This should print that TensorBoard has started. Next, connect to http://tensorboard_server_ip:6006.

TensorBoard requires a logdir to read logs from. For information on configuring TensorBoard, run tensorboard --help. The log directory can be changed with “tensorboard_log_dir” and the sampling frequency with “tensorboard_freq”.

tensorboard --logdir path/to/logs
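Before launching TensorBoard, it can help to confirm that training has actually written event files. The sketch below assumes the log directory is the value of “tensorboard_log_dir” ("log" in the excerpt above) and that the event files follow TensorFlow's usual events.out.tfevents.* naming. The function names are hypothetical. The script only prints the command rather than running it.

```python
from pathlib import Path


def find_event_files(log_dir):
    """List TensorFlow event files found anywhere under `log_dir`."""
    root = Path(log_dir)
    if not root.is_dir():
        return []
    # TensorFlow summary writers name their files "events.out.tfevents.*".
    return sorted(root.rglob("events.out.tfevents.*"))


def tensorboard_command(log_dir, port=6006):
    """Build the argument list for launching TensorBoard on `log_dir`."""
    return ["tensorboard", "--logdir", str(log_dir), "--port", str(port)]


if __name__ == "__main__":
    log_dir = "log"  # assumed value of "tensorboard_log_dir"
    if find_event_files(log_dir):
        print(" ".join(tensorboard_command(log_dir)))
    else:
        print(f"No event files under {log_dir!r} yet; run training first.")
```

The returned argument list can be passed to subprocess.run if you prefer to start TensorBoard from a script.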

4.5.3. Examples

4.5.3.1. Tracking and visualizing loss metrics (red: train, blue: test)

[Figures: loss metric curves (images not reproduced)]

4.5.3.2. Visualizing the DeePMD-kit model graph

[Figure: DeePMD-kit model graph (image not reproduced)]

4.5.3.3. Viewing histograms of weights, biases, or other tensors as they change over time

[Figures: histograms of weights, biases, or other tensors (images not reproduced)]

4.5.3.4. Viewing summaries of trainable variables

[Figure: summaries of trainable variables (image not reproduced)]

4.5.4. Attention

Enabling TensorBoard analysis takes extra execution time (e.g., a 15% increase on an Nvidia GTX 1080Ti in double precision with the default water sample).
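For a rough feel of that cost, the arithmetic can be sketched as follows. This assumes that “tensorboard_freq” counts training batches between summary writes and that the 15% figure quoted above carries over to your hardware and model, which it may not. Both function names are hypothetical.

```python
def summary_write_count(stop_batch, tensorboard_freq):
    """Number of summary writes over a run, assuming one write every
    `tensorboard_freq` batches (an assumption, not confirmed by the docs)."""
    return stop_batch // tensorboard_freq


def time_with_overhead(base_hours, overhead_fraction=0.15):
    """Scale a baseline training time by a fractional overhead.

    The 0.15 default is the single data point quoted in the text, not a
    general property of TensorBoard logging.
    """
    return base_hours * (1.0 + overhead_fraction)


if __name__ == "__main__":
    # Values taken from the example "training" section.
    print(summary_write_count(stop_batch=1000000, tensorboard_freq=1000))
    print(time_with_overhead(base_hours=10.0))
```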

TensorBoard can be used in Google Chrome or Firefox. Other browsers might work, but there may be bugs or performance issues.