Package evaluation of DeepQLearning on Julia 1.13.0-DEV.1080 (ed57414aec*) started at 2025-09-05T00:03:17.722 ################################################################################ # Set-up # Installing PkgEval dependencies (TestEnv)... Set-up completed after 9.42s ################################################################################ # Installation # Installing DeepQLearning... Resolving package versions... Updating `~/.julia/environments/v1.13/Project.toml` [de0a67f4] + DeepQLearning v0.7.2 Updating `~/.julia/environments/v1.13/Manifest.toml` [621f4979] + AbstractFFTs v1.5.0 [7d9f7c33] + Accessors v0.1.42 [79e6a3ab] + Adapt v4.3.0 [66dad0bd] + AliasTables v1.1.3 [dce04be8] + ArgCheck v2.5.0 [ec485272] + ArnoldiMethod v0.4.0 [4fba245c] + ArrayInterface v7.20.0 [a9b6321e] + Atomix v1.1.2 [fbb218c0] + BSON v0.3.9 [198e06fe] + BangBang v0.4.4 [9718e550] + Baselet v0.1.1 [e1450e63] + BufferedStreams v1.2.2 [fa961155] + CEnum v0.5.0 [082447d4] + ChainRules v1.72.5 [d360d2e6] + ChainRulesCore v1.26.0 [35d6a980] + ColorSchemes v3.30.0 [3da002f7] + ColorTypes v0.12.1 [c3611d14] + ColorVectorSpace v0.11.0 [5ae59095] + Colors v0.13.1 [d842c3ba] + CommonRLInterface v0.3.3 [bbf7d656] + CommonSubexpressions v0.3.1 [f70d9fcc] + CommonWorldInvalidations v1.0.0 [34da2185] + Compat v4.18.0 [a33af91c] + CompositionsBase v0.1.2 [187b0558] + ConstructionBase v1.6.0 [6add18c4] + ContextVariablesX v0.1.3 [d38c429a] + Contour v0.6.3 [a8cc5b0e] + Crayons v4.1.1 [9a962f9c] + DataAPI v1.16.0 [a93c6f00] + DataFrames v1.7.1 [864edb3b] + DataStructures v0.19.1 [e2d170a0] + DataValueInterfaces v1.0.0 [de0a67f4] + DeepQLearning v0.7.2 [244e2a9f] + DefineSingletons v0.1.2 [8bb1440f] + DelimitedFiles v1.9.1 [163ba53b] + DiffResults v1.1.0 [b552c78f] + DiffRules v1.15.1 [31c24e10] + Distributions v0.25.120 [ffbed154] + DocStringExtensions v0.9.5 [da5c29d0] + EllipsisNotation v1.8.0 [4e289a0a] + EnumX v1.0.5 [cc61a311] + FLoops v0.2.2 [b9860ae5] + FLoopsBase v0.1.1 [5789e2e9] + 
FileIO v1.17.0 [1a297f60] + FillArrays v1.13.0 [53c48c17] + FixedPointNumbers v0.8.5 ⌅ [587475ba] + Flux v0.14.25 [f6369f11] + ForwardDiff v1.1.0 ⌅ [d9f16b24] + Functors v0.4.12 [0c68f7d7] + GPUArrays v11.2.4 [46192b85] + GPUArraysCore v0.2.0 [86223c79] + Graphs v1.13.1 [076d061b] + HashArrayMappedTries v0.2.0 [34004b35] + HypergeometricFunctions v0.3.28 [7869d1d1] + IRTools v0.4.15 [615f187c] + IfElse v0.1.1 [a09fc81d] + ImageCore v0.10.5 [d25df0c9] + Inflate v0.1.5 [22cec73e] + InitialValues v0.3.1 [842dd82b] + InlineStrings v1.4.5 [3587e190] + InverseFunctions v0.1.17 [41ab1584] + InvertedIndices v1.3.1 [92d709cd] + IrrationalConstants v0.2.4 [82899510] + IteratorInterfaceExtensions v1.0.0 [692b3bcd] + JLLWrappers v1.7.1 [b14d175d] + JuliaVariables v0.2.4 [63c18a36] + KernelAbstractions v0.9.38 [929cbde3] + LLVM v9.4.2 [b964fa9f] + LaTeXStrings v1.4.0 [2ab3a3ac] + LogExpFunctions v0.3.29 [c2834f40] + MLCore v1.0.0 ⌃ [7e8f7934] + MLDataDevices v1.5.3 [d8e11817] + MLStyle v0.4.17 [f1d291b0] + MLUtils v0.4.8 [1914dd2f] + MacroTools v0.5.16 [dbb5928d] + MappedArrays v0.4.2 [299715c1] + MarchingCubes v0.1.11 [128add7d] + MicroCollections v0.2.0 [e1d29d7a] + Missings v1.2.0 [e94cdb99] + MosaicViews v0.3.4 [872c559c] + NNlib v0.9.31 [77ba4419] + NaNMath v1.1.3 [71a1bf82] + NameResolution v0.1.5 [d9ec5142] + NamedTupleTools v0.14.3 [6fe1bfb0] + OffsetArrays v1.17.0 [0b1bfda6] + OneHotArrays v0.2.10 ⌅ [3bd65402] + Optimisers v0.3.4 [bac558e1] + OrderedCollections v1.8.1 [90014a1f] + PDMats v0.11.35 [f3bd98c0] + POMDPLinter v0.1.2 [7588e00f] + POMDPTools v1.1.0 [a93abf59] + POMDPs v1.0.0 [5432bcbf] + PaddedViews v0.5.12 [d96e819e] + Parameters v0.12.3 [2dfb63ee] + PooledArrays v1.4.3 [aea7be01] + PrecompileTools v1.3.3 [21216c6a] + Preferences v1.5.0 [8162dcfd] + PrettyPrint v0.2.0 ⌅ [08abe8d2] + PrettyTables v2.4.0 [33c8b6b6] + ProgressLogging v0.1.5 [92933f4c] + ProgressMeter v1.11.0 [3349acd9] + ProtoBuf v1.1.1 [43287f4e] + PtrArrays v1.3.0 [1fd47b50] + QuadGK v2.11.2 
[c1ae055f] + RealDot v0.1.0 [189a3867] + Reexport v1.2.2 [ae029012] + Requires v1.3.1 [79098fc4] + Rmath v0.8.0 [7e506255] + ScopedValues v1.5.0 [91c51154] + SentinelArrays v1.4.8 [efcf1570] + Setfield v1.1.2 [605ecd9f] + ShowCases v0.1.0 [699a6c99] + SimpleTraits v0.9.5 [a2af1166] + SortingAlgorithms v1.2.2 [dc90abb0] + SparseInverseSubset v0.1.2 [276daf66] + SpecialFunctions v2.5.1 [171d559e] + SplittablesBase v0.1.15 [cae243ae] + StackViews v0.1.2 [aedffcd0] + Static v1.2.0 [0d7ed370] + StaticArrayInterface v1.8.0 [90137ffa] + StaticArrays v1.9.15 [1e83bf80] + StaticArraysCore v1.4.3 [10745b16] + Statistics v1.11.1 [82ae8749] + StatsAPI v1.7.1 [2913bbd2] + StatsBase v0.34.6 [4c63d2b9] + StatsFuns v1.5.0 [892a3eda] + StringManipulation v0.4.1 [09ab397b] + StructArrays v0.7.1 [3783bdb8] + TableTraits v1.0.1 [bd369af6] + Tables v1.12.1 [899adc3e] + TensorBoardLogger v0.1.26 [62fd8b95] + TensorCore v0.1.1 [28d57a85] + Transducers v0.4.84 [410a4b4d] + Tricks v0.1.12 [3a884ed6] + UnPack v1.0.2 [b8865327] + UnicodePlots v3.8.1 [013be700] + UnsafeAtomics v0.3.0 ⌅ [e88e6eb3] + Zygote v0.6.77 [700de1a5] + ZygoteRules v0.2.7 [dad2f222] + LLVMExtra_jll v0.0.37+2 [efe28fd5] + OpenSpecFun_jll v0.5.6+0 [f50d1b31] + Rmath_jll v0.5.1+0 [0dad84c5] + ArgTools v1.1.2 [56f22d72] + Artifacts v1.11.0 [2a0f44e3] + Base64 v1.11.0 [8bf52ea8] + CRC32c v1.11.0 [ade2ca70] + Dates v1.11.0 [8ba89e20] + Distributed v1.11.0 [f43a241f] + Downloads v1.7.0 [7b1f6079] + FileWatching v1.11.0 [9fa8497b] + Future v1.11.0 [b77e0a4c] + InteractiveUtils v1.11.0 [ac6e5ff7] + JuliaSyntaxHighlighting v1.12.0 [4af54fe1] + LazyArtifacts v1.11.0 [b27032c2] + LibCURL v0.6.4 [76f85450] + LibGit2 v1.11.0 [8f399da3] + Libdl v1.11.0 [37e2e46d] + LinearAlgebra v1.13.0 [56ddb016] + Logging v1.11.0 [d6f4376e] + Markdown v1.11.0 [a63ad114] + Mmap v1.11.0 [ca575930] + NetworkOptions v1.3.0 [44cfe95a] + Pkg v1.13.0 [de0858da] + Printf v1.11.0 [9a3f8284] + Random v1.11.0 [ea8e919c] + SHA v0.7.0 [9e88b42a] + Serialization 
v1.11.0 [1a1011a3] + SharedArrays v1.11.0 [6462fe0b] + Sockets v1.11.0 [2f01184e] + SparseArrays v1.13.0 [f489334b] + StyledStrings v1.11.0 [4607b0f0] + SuiteSparse [fa267f1f] + TOML v1.0.3 [a4e569a6] + Tar v1.10.0 [8dfed614] + Test v1.11.0 [cf7118a7] + UUIDs v1.11.0 [4ec0a83e] + Unicode v1.11.0 [e66e0078] + CompilerSupportLibraries_jll v1.3.0+1 [deac9b47] + LibCURL_jll v8.15.0+1 [e37daf67] + LibGit2_jll v1.9.1+0 [29816b5a] + LibSSH2_jll v1.11.3+1 [14a3606d] + MozillaCACerts_jll v2025.8.12 [4536629a] + OpenBLAS_jll v0.3.29+0 [05823500] + OpenLibm_jll v0.8.7+0 [458c3c95] + OpenSSL_jll v3.5.2+0 [efcefdf7] + PCRE2_jll v10.46.0+0 [bea87d4a] + SuiteSparse_jll v7.10.1+0 [83775a58] + Zlib_jll v1.3.1+2 [3161d3a3] + Zstd_jll v1.5.7+1 [8e850b90] + libblastrampoline_jll v5.13.1+0 [8e850ede] + nghttp2_jll v1.67.0+0 [3f19e933] + p7zip_jll v17.6.0+0 Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m` Installation completed after 6.14s ################################################################################ # Precompilation # Precompiling PkgEval dependencies... ┌ Warning: Could not use exact versions of packages in manifest, re-resolving └ @ TestEnv ~/.julia/packages/TestEnv/nGMfF/src/julia-1.11/activate_set.jl:76 Precompiling package dependencies... Precompilation completed after 473.59s ################################################################################ # Testing # Testing DeepQLearning Test Could not use exact versions of packages in manifest, re-resolving. Note: if you do not check your manifest file into source control, then you can probably ignore this message. However, if you do check your manifest file into source control, then you probably want to pass the `allow_reresolve = false` kwarg when calling the `Pkg.test` function. 
Updating `/tmp/jl_O1fkLH/Project.toml` [de0a67f4] + DeepQLearning v0.7.2 [355abbd5] + POMDPModels v0.4.21 Updating `/tmp/jl_O1fkLH/Manifest.toml` [a81c6b42] + Compose v0.9.6 ⌅ [864edb3b] ↓ DataStructures v0.19.1 ⇒ v0.18.22 [de0a67f4] + DeepQLearning v0.7.2 [c8e1da08] + IterTools v1.10.0 [682c06a0] + JSON v0.21.4 [442fdcdd] + Measures v0.3.2 [355abbd5] + POMDPModels v0.4.21 [69de0a69] + Parsers v2.8.3 Info Packages marked with ⌅ have new versions available but compatibility constraints restrict them from upgrading. To see why use `status --outdated -m` Test Successfully re-resolved Status `/tmp/jl_O1fkLH/Project.toml` [fbb218c0] BSON v0.3.9 [d842c3ba] CommonRLInterface v0.3.3 [de0a67f4] DeepQLearning v0.7.2 [da5c29d0] EllipsisNotation v1.8.0 ⌅ [587475ba] Flux v0.14.25 [f3bd98c0] POMDPLinter v0.1.2 [355abbd5] POMDPModels v0.4.21 [7588e00f] POMDPTools v1.1.0 [a93abf59] POMDPs v1.0.0 [d96e819e] Parameters v0.12.3 [90137ffa] StaticArrays v1.9.15 [2913bbd2] StatsBase v0.34.6 [899adc3e] TensorBoardLogger v0.1.26 [37e2e46d] LinearAlgebra v1.13.0 [de0858da] Printf v1.11.0 [9a3f8284] Random v1.11.0 [8dfed614] Test v1.11.0 Status `/tmp/jl_O1fkLH/Manifest.toml` [621f4979] AbstractFFTs v1.5.0 [7d9f7c33] Accessors v0.1.42 [79e6a3ab] Adapt v4.3.0 [66dad0bd] AliasTables v1.1.3 [dce04be8] ArgCheck v2.5.0 [ec485272] ArnoldiMethod v0.4.0 [4fba245c] ArrayInterface v7.20.0 [a9b6321e] Atomix v1.1.2 [fbb218c0] BSON v0.3.9 [198e06fe] BangBang v0.4.4 [9718e550] Baselet v0.1.1 [e1450e63] BufferedStreams v1.2.2 [fa961155] CEnum v0.5.0 [082447d4] ChainRules v1.72.5 [d360d2e6] ChainRulesCore v1.26.0 [35d6a980] ColorSchemes v3.30.0 [3da002f7] ColorTypes v0.12.1 [c3611d14] ColorVectorSpace v0.11.0 [5ae59095] Colors v0.13.1 [d842c3ba] CommonRLInterface v0.3.3 [bbf7d656] CommonSubexpressions v0.3.1 [f70d9fcc] CommonWorldInvalidations v1.0.0 [34da2185] Compat v4.18.0 [a81c6b42] Compose v0.9.6 [a33af91c] CompositionsBase v0.1.2 [187b0558] ConstructionBase v1.6.0 [6add18c4] ContextVariablesX v0.1.3 
[d38c429a] Contour v0.6.3 [a8cc5b0e] Crayons v4.1.1 [9a962f9c] DataAPI v1.16.0 [a93c6f00] DataFrames v1.7.1 ⌅ [864edb3b] DataStructures v0.18.22 [e2d170a0] DataValueInterfaces v1.0.0 [de0a67f4] DeepQLearning v0.7.2 [244e2a9f] DefineSingletons v0.1.2 [8bb1440f] DelimitedFiles v1.9.1 [163ba53b] DiffResults v1.1.0 [b552c78f] DiffRules v1.15.1 [31c24e10] Distributions v0.25.120 [ffbed154] DocStringExtensions v0.9.5 [da5c29d0] EllipsisNotation v1.8.0 [4e289a0a] EnumX v1.0.5 [cc61a311] FLoops v0.2.2 [b9860ae5] FLoopsBase v0.1.1 [5789e2e9] FileIO v1.17.0 [1a297f60] FillArrays v1.13.0 [53c48c17] FixedPointNumbers v0.8.5 ⌅ [587475ba] Flux v0.14.25 [f6369f11] ForwardDiff v1.1.0 ⌅ [d9f16b24] Functors v0.4.12 [0c68f7d7] GPUArrays v11.2.4 [46192b85] GPUArraysCore v0.2.0 [86223c79] Graphs v1.13.1 [076d061b] HashArrayMappedTries v0.2.0 [34004b35] HypergeometricFunctions v0.3.28 [7869d1d1] IRTools v0.4.15 [615f187c] IfElse v0.1.1 [a09fc81d] ImageCore v0.10.5 [d25df0c9] Inflate v0.1.5 [22cec73e] InitialValues v0.3.1 [842dd82b] InlineStrings v1.4.5 [3587e190] InverseFunctions v0.1.17 [41ab1584] InvertedIndices v1.3.1 [92d709cd] IrrationalConstants v0.2.4 [c8e1da08] IterTools v1.10.0 [82899510] IteratorInterfaceExtensions v1.0.0 [692b3bcd] JLLWrappers v1.7.1 [682c06a0] JSON v0.21.4 [b14d175d] JuliaVariables v0.2.4 [63c18a36] KernelAbstractions v0.9.38 [929cbde3] LLVM v9.4.2 [b964fa9f] LaTeXStrings v1.4.0 [2ab3a3ac] LogExpFunctions v0.3.29 [c2834f40] MLCore v1.0.0 ⌃ [7e8f7934] MLDataDevices v1.5.3 [d8e11817] MLStyle v0.4.17 [f1d291b0] MLUtils v0.4.8 [1914dd2f] MacroTools v0.5.16 [dbb5928d] MappedArrays v0.4.2 [299715c1] MarchingCubes v0.1.11 [442fdcdd] Measures v0.3.2 [128add7d] MicroCollections v0.2.0 [e1d29d7a] Missings v1.2.0 [e94cdb99] MosaicViews v0.3.4 [872c559c] NNlib v0.9.31 [77ba4419] NaNMath v1.1.3 [71a1bf82] NameResolution v0.1.5 [d9ec5142] NamedTupleTools v0.14.3 [6fe1bfb0] OffsetArrays v1.17.0 [0b1bfda6] OneHotArrays v0.2.10 ⌅ [3bd65402] Optimisers v0.3.4 [bac558e1] 
OrderedCollections v1.8.1 [90014a1f] PDMats v0.11.35 [f3bd98c0] POMDPLinter v0.1.2 [355abbd5] POMDPModels v0.4.21 [7588e00f] POMDPTools v1.1.0 [a93abf59] POMDPs v1.0.0 [5432bcbf] PaddedViews v0.5.12 [d96e819e] Parameters v0.12.3 [69de0a69] Parsers v2.8.3 [2dfb63ee] PooledArrays v1.4.3 [aea7be01] PrecompileTools v1.3.3 [21216c6a] Preferences v1.5.0 [8162dcfd] PrettyPrint v0.2.0 ⌅ [08abe8d2] PrettyTables v2.4.0 [33c8b6b6] ProgressLogging v0.1.5 [92933f4c] ProgressMeter v1.11.0 [3349acd9] ProtoBuf v1.1.1 [43287f4e] PtrArrays v1.3.0 [1fd47b50] QuadGK v2.11.2 [c1ae055f] RealDot v0.1.0 [189a3867] Reexport v1.2.2 [ae029012] Requires v1.3.1 [79098fc4] Rmath v0.8.0 [7e506255] ScopedValues v1.5.0 [91c51154] SentinelArrays v1.4.8 [efcf1570] Setfield v1.1.2 [605ecd9f] ShowCases v0.1.0 [699a6c99] SimpleTraits v0.9.5 [a2af1166] SortingAlgorithms v1.2.2 [dc90abb0] SparseInverseSubset v0.1.2 [276daf66] SpecialFunctions v2.5.1 [171d559e] SplittablesBase v0.1.15 [cae243ae] StackViews v0.1.2 [aedffcd0] Static v1.2.0 [0d7ed370] StaticArrayInterface v1.8.0 [90137ffa] StaticArrays v1.9.15 [1e83bf80] StaticArraysCore v1.4.3 [10745b16] Statistics v1.11.1 [82ae8749] StatsAPI v1.7.1 [2913bbd2] StatsBase v0.34.6 [4c63d2b9] StatsFuns v1.5.0 [892a3eda] StringManipulation v0.4.1 [09ab397b] StructArrays v0.7.1 [3783bdb8] TableTraits v1.0.1 [bd369af6] Tables v1.12.1 [899adc3e] TensorBoardLogger v0.1.26 [62fd8b95] TensorCore v0.1.1 [28d57a85] Transducers v0.4.84 [410a4b4d] Tricks v0.1.12 [3a884ed6] UnPack v1.0.2 [b8865327] UnicodePlots v3.8.1 [013be700] UnsafeAtomics v0.3.0 ⌅ [e88e6eb3] Zygote v0.6.77 [700de1a5] ZygoteRules v0.2.7 [dad2f222] LLVMExtra_jll v0.0.37+2 [efe28fd5] OpenSpecFun_jll v0.5.6+0 [f50d1b31] Rmath_jll v0.5.1+0 [0dad84c5] ArgTools v1.1.2 [56f22d72] Artifacts v1.11.0 [2a0f44e3] Base64 v1.11.0 [8bf52ea8] CRC32c v1.11.0 [ade2ca70] Dates v1.11.0 [8ba89e20] Distributed v1.11.0 [f43a241f] Downloads v1.7.0 [7b1f6079] FileWatching v1.11.0 [9fa8497b] Future v1.11.0 [b77e0a4c] 
InteractiveUtils v1.11.0 [ac6e5ff7] JuliaSyntaxHighlighting v1.12.0 [4af54fe1] LazyArtifacts v1.11.0 [b27032c2] LibCURL v0.6.4 [76f85450] LibGit2 v1.11.0 [8f399da3] Libdl v1.11.0 [37e2e46d] LinearAlgebra v1.13.0 [56ddb016] Logging v1.11.0 [d6f4376e] Markdown v1.11.0 [a63ad114] Mmap v1.11.0 [ca575930] NetworkOptions v1.3.0 [44cfe95a] Pkg v1.13.0 [de0858da] Printf v1.11.0 [9a3f8284] Random v1.11.0 [ea8e919c] SHA v0.7.0 [9e88b42a] Serialization v1.11.0 [1a1011a3] SharedArrays v1.11.0 [6462fe0b] Sockets v1.11.0 [2f01184e] SparseArrays v1.13.0 [f489334b] StyledStrings v1.11.0 [4607b0f0] SuiteSparse [fa267f1f] TOML v1.0.3 [a4e569a6] Tar v1.10.0 [8dfed614] Test v1.11.0 [cf7118a7] UUIDs v1.11.0 [4ec0a83e] Unicode v1.11.0 [e66e0078] CompilerSupportLibraries_jll v1.3.0+1 [deac9b47] LibCURL_jll v8.15.0+1 [e37daf67] LibGit2_jll v1.9.1+0 [29816b5a] LibSSH2_jll v1.11.3+1 [14a3606d] MozillaCACerts_jll v2025.8.12 [4536629a] OpenBLAS_jll v0.3.29+0 [05823500] OpenLibm_jll v0.8.7+0 [458c3c95] OpenSSL_jll v3.5.2+0 [efcefdf7] PCRE2_jll v10.46.0+0 [bea87d4a] SuiteSparse_jll v7.10.1+0 [83775a58] Zlib_jll v1.3.1+2 [3161d3a3] Zstd_jll v1.5.7+1 [8e850b90] libblastrampoline_jll v5.13.1+0 [8e850ede] nghttp2_jll v1.67.0+0 [3f19e933] p7zip_jll v17.6.0+0 Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. Testing Running tests... Precompiling packages... 8994.8 ms ✓ Compose 28713.3 ms ✓ POMDPModels 2 dependencies successfully precompiled in 44 seconds. 118 already precompiled. 500 / 10000 eps 0.901 | avgR -0.001 | Loss 7.418e-02 | Grad 9.045e-02 | EvalR -Inf 1000 / 10000 eps 0.802 | avgR 0.177 | Loss 8.516e-02 | Grad 5.236e-02 | EvalR -Inf 1500 / 10000 eps 0.703 | avgR 0.241 | Loss 3.363e-02 | Grad 4.933e-02 | EvalR -Inf 2000 / 10000 eps 0.604 | avgR 0.553 | Loss 2.811e-02 | Grad 3.963e-02 | EvalR -Inf Evaluation ... 
Avg Reward 2.00 | Avg Step 5.00 2500 / 10000 eps 0.505 | avgR 0.505 | Loss 2.485e-02 | Grad 4.884e-02 | EvalR 2.000 3000 / 10000 eps 0.406 | avgR 0.985 | Loss 1.784e-02 | Grad 5.140e-02 | EvalR 2.000 3500 / 10000 eps 0.307 | avgR 1.193 | Loss 1.305e-02 | Grad 1.445e-02 | EvalR 2.000 4000 / 10000 eps 0.208 | avgR 1.472 | Loss 3.622e-02 | Grad 6.899e-02 | EvalR 2.000 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 4500 / 10000 eps 0.109 | avgR 1.377 | Loss 6.950e-03 | Grad 3.648e-02 | EvalR 2.100 5000 / 10000 eps 0.010 | avgR 1.961 | Loss 4.442e-02 | Grad 2.751e-02 | EvalR 2.100 5500 / 10000 eps 0.010 | avgR 2.048 | Loss 6.467e-04 | Grad 9.072e-03 | EvalR 2.100 6000 / 10000 eps 0.010 | avgR 2.078 | Loss 7.904e-06 | Grad 2.474e-03 | EvalR 2.100 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 6500 / 10000 eps 0.010 | avgR 1.845 | Loss 3.623e-04 | Grad 1.265e-02 | EvalR 2.100 7000 / 10000 eps 0.010 | avgR 2.031 | Loss 1.488e-03 | Grad 1.572e-02 | EvalR 2.100 7500 / 10000 eps 0.010 | avgR 2.075 | Loss 7.097e-04 | Grad 6.582e-03 | EvalR 2.100 8000 / 10000 eps 0.010 | avgR 2.054 | Loss 2.055e-06 | Grad 5.702e-04 | EvalR 2.100 Evaluation ... 
Avg Reward 2.10 | Avg Step 5.00 8500 / 10000 eps 0.010 | avgR 2.047 | Loss 1.494e-04 | Grad 2.688e-03 | EvalR 2.100 9000 / 10000 eps 0.010 | avgR 2.077 | Loss 8.946e-05 | Grad 2.248e-03 | EvalR 2.100 9500 / 10000 eps 0.010 | avgR 2.076 | Loss 2.562e-06 | Grad 6.147e-04 | EvalR 2.100 10000 / 10000 eps 0.010 | avgR 2.076 | Loss 2.140e-06 | Grad 9.866e-04 | EvalR 2.100 Restore model with eval reward 2.100 Test Summary: | Pass Total Time vanilla DQN | 2 2 2m26.0s 500 / 10000 eps 0.901 | avgR -0.029 | Loss 3.960e-02 | Grad 7.405e-02 | EvalR -Inf 1000 / 10000 eps 0.802 | avgR 0.102 | Loss 3.929e-02 | Grad 2.780e-02 | EvalR -Inf 1500 / 10000 eps 0.703 | avgR 0.470 | Loss 2.473e-02 | Grad 3.728e-02 | EvalR -Inf 2000 / 10000 eps 0.604 | avgR 0.588 | Loss 2.197e-02 | Grad 4.640e-02 | EvalR -Inf Evaluation ... Avg Reward 2.00 | Avg Step 5.00 2500 / 10000 eps 0.505 | avgR 0.854 | Loss 5.706e-02 | Grad 1.212e-01 | EvalR 2.000 3000 / 10000 eps 0.406 | avgR 0.985 | Loss 1.326e-02 | Grad 3.517e-02 | EvalR 2.000 3500 / 10000 eps 0.307 | avgR 1.249 | Loss 3.265e-03 | Grad 2.011e-02 | EvalR 2.000 4000 / 10000 eps 0.208 | avgR 1.394 | Loss 1.118e-02 | Grad 7.234e-02 | EvalR 2.000 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 4500 / 10000 eps 0.109 | avgR 0.575 | Loss 5.980e-02 | Grad 5.101e-02 | EvalR 2.100 5000 / 10000 eps 0.010 | avgR 1.790 | Loss 6.331e-02 | Grad 3.773e-02 | EvalR 2.100 5500 / 10000 eps 0.010 | avgR 2.047 | Loss 2.518e-03 | Grad 1.522e-02 | EvalR 2.100 6000 / 10000 eps 0.010 | avgR 2.058 | Loss 9.965e-04 | Grad 2.212e-02 | EvalR 2.100 Evaluation ... 
Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 6500 / 10000 eps 0.010 | avgR 2.046 | Loss 4.916e-06 | Grad 9.225e-04 | EvalR 2.100 7000 / 10000 eps 0.010 | avgR 2.050 | Loss 1.268e-03 | Grad 1.982e-02 | EvalR 2.100 7500 / 10000 eps 0.010 | avgR 2.068 | Loss 1.600e-04 | Grad 8.429e-03 | EvalR 2.100 8000 / 10000 eps 0.010 | avgR 2.079 | Loss 2.913e-05 | Grad 1.524e-03 | EvalR 2.100 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 8500 / 10000 eps 0.010 | avgR 2.057 | Loss 3.345e-06 | Grad 1.367e-03 | EvalR 2.100 9000 / 10000 eps 0.010 | avgR 2.058 | Loss 4.569e-05 | Grad 4.637e-03 | EvalR 2.100 9500 / 10000 eps 0.010 | avgR 2.055 | Loss 3.858e-06 | Grad 1.578e-03 | EvalR 2.100 10000 / 10000 eps 0.010 | avgR 2.055 | Loss 7.535e-05 | Grad 2.232e-03 | EvalR 2.100 Restore model with eval reward 2.100 Test Summary: | Pass Total Time double Q DQN | 1 1 5.8s 500 / 10000 eps 0.901 | avgR -0.135 | Loss 9.973e-02 | Grad 7.339e-02 | EvalR -Inf 1000 / 10000 eps 0.802 | avgR 0.072 | Loss 1.066e-01 | Grad 9.531e-02 | EvalR -Inf 1500 / 10000 eps 0.703 | avgR 0.354 | Loss 2.052e-02 | Grad 1.346e-01 | EvalR -Inf 2000 / 10000 eps 0.604 | avgR 0.689 | Loss 2.085e-02 | Grad 2.960e-02 | EvalR -Inf Evaluation ... Avg Reward 2.00 | Avg Step 5.00 2500 / 10000 eps 0.505 | avgR 0.262 | Loss 2.363e-02 | Grad 1.648e-02 | EvalR 2.000 3000 / 10000 eps 0.406 | avgR 1.017 | Loss 6.526e-02 | Grad 4.486e-02 | EvalR 2.000 3500 / 10000 eps 0.307 | avgR 0.661 | Loss 2.148e-02 | Grad 1.425e-01 | EvalR 2.000 4000 / 10000 eps 0.208 | avgR 1.588 | Loss 9.481e-02 | Grad 6.224e-02 | EvalR 2.000 Evaluation ... 
Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 4500 / 10000 eps 0.109 | avgR 1.813 | Loss 1.909e-02 | Grad 2.989e-02 | EvalR 2.100 5000 / 10000 eps 0.010 | avgR 2.005 | Loss 2.848e-04 | Grad 8.020e-03 | EvalR 2.100 5500 / 10000 eps 0.010 | avgR 2.074 | Loss 3.339e-05 | Grad 3.163e-03 | EvalR 2.100 6000 / 10000 eps 0.010 | avgR 2.058 | Loss 3.957e-05 | Grad 4.274e-03 | EvalR 2.100 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 6500 / 10000 eps 0.010 | avgR 2.077 | Loss 2.744e-06 | Grad 1.633e-03 | EvalR 2.100 7000 / 10000 eps 0.010 | avgR 2.068 | Loss 5.152e-05 | Grad 6.219e-03 | EvalR 2.100 7500 / 10000 eps 0.010 | avgR 2.036 | Loss 5.032e-05 | Grad 3.266e-03 | EvalR 2.100 8000 / 10000 eps 0.010 | avgR 2.044 | Loss 1.809e-04 | Grad 1.197e-02 | EvalR 2.100 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 8500 / 10000 eps 0.010 | avgR 2.076 | Loss 1.384e-04 | Grad 5.471e-03 | EvalR 2.100 9000 / 10000 eps 0.010 | avgR 2.060 | Loss 3.189e-06 | Grad 2.263e-03 | EvalR 2.100 9500 / 10000 eps 0.010 | avgR 2.029 | Loss 9.793e-06 | Grad 2.356e-03 | EvalR 2.100 10000 / 10000 eps 0.010 | avgR 2.070 | Loss 4.190e-05 | Grad 5.168e-03 | EvalR 2.100 Restore model with eval reward 2.100 Test Summary: | Pass Total Time dueling DQN | 1 1 21.3s 500 / 10000 eps 0.901 | avgR 0.009 | Loss 1.334e-01 | Grad 1.650e-01 | EvalR -Inf 1000 / 10000 eps 0.802 | avgR 0.105 | Loss 3.002e-02 | Grad 1.995e-02 | EvalR -Inf 1500 / 10000 eps 0.703 | avgR 0.435 | Loss 2.069e-02 | Grad 4.219e-02 | EvalR -Inf 2000 / 10000 eps 0.604 | avgR 0.564 | Loss 1.464e-02 | Grad 1.283e-02 | EvalR -Inf Evaluation ... 
Avg Reward 1.90 | Avg Step 5.00 2500 / 10000 eps 0.505 | avgR 0.727 | Loss 3.614e-02 | Grad 6.802e-02 | EvalR 1.900 3000 / 10000 eps 0.406 | avgR 0.975 | Loss 4.116e-02 | Grad 1.370e-01 | EvalR 1.900 3500 / 10000 eps 0.307 | avgR 1.199 | Loss 9.364e-03 | Grad 2.574e-02 | EvalR 1.900 4000 / 10000 eps 0.208 | avgR 1.338 | Loss 1.907e-02 | Grad 1.809e-02 | EvalR 1.900 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 4500 / 10000 eps 0.109 | avgR 1.808 | Loss 5.820e-03 | Grad 2.542e-02 | EvalR 2.100 5000 / 10000 eps 0.010 | avgR 1.951 | Loss 4.262e-03 | Grad 1.617e-02 | EvalR 2.100 5500 / 10000 eps 0.010 | avgR 2.072 | Loss 4.240e-04 | Grad 1.827e-02 | EvalR 2.100 6000 / 10000 eps 0.010 | avgR 1.877 | Loss 6.181e-03 | Grad 1.030e-02 | EvalR 2.100 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 6500 / 10000 eps 0.010 | avgR 2.051 | Loss 1.008e-02 | Grad 9.425e-03 | EvalR 2.100 7000 / 10000 eps 0.010 | avgR 2.010 | Loss 3.438e-04 | Grad 1.301e-02 | EvalR 2.100 7500 / 10000 eps 0.010 | avgR 2.070 | Loss 5.226e-05 | Grad 1.905e-03 | EvalR 2.100 8000 / 10000 eps 0.010 | avgR 2.028 | Loss 2.939e-04 | Grad 4.258e-03 | EvalR 2.100 Evaluation ... 
Avg Reward 2.10 | Avg Step 5.00 8500 / 10000 eps 0.010 | avgR 2.049 | Loss 9.601e-05 | Grad 4.591e-03 | EvalR 2.100 9000 / 10000 eps 0.010 | avgR 2.075 | Loss 9.902e-05 | Grad 3.615e-03 | EvalR 2.100 9500 / 10000 eps 0.010 | avgR 2.046 | Loss 1.777e-04 | Grad 9.077e-03 | EvalR 2.100 10000 / 10000 eps 0.010 | avgR 2.077 | Loss 9.857e-04 | Grad 1.261e-03 | EvalR 2.100 Restore model with eval reward 2.100 Test Summary: | Pass Total Time Prioritized DDQN | 1 1 6.9s 500 / 10000 eps 0.901 | avgR 0.092 | Loss 1.474e-03 | Grad 1.169e-03 | EvalR -Inf 1000 / 10000 eps 0.802 | avgR 0.130 | Loss 6.859e-04 | Grad 1.462e-03 | EvalR -Inf 1500 / 10000 eps 0.703 | avgR 0.334 | Loss 6.310e-04 | Grad 7.830e-04 | EvalR -Inf 2000 / 10000 eps 0.604 | avgR 0.406 | Loss 8.951e-04 | Grad 1.855e-03 | EvalR -Inf Evaluation ... Avg Reward 2.10 | Avg Step 5.00 2500 / 10000 eps 0.505 | avgR 0.733 | Loss 1.719e-04 | Grad 2.936e-04 | EvalR 2.100 3000 / 10000 eps 0.406 | avgR 0.896 | Loss 1.049e-03 | Grad 1.755e-03 | EvalR 2.100 3500 / 10000 eps 0.307 | avgR 1.104 | Loss 4.911e-04 | Grad 8.762e-04 | EvalR 2.100 4000 / 10000 eps 0.208 | avgR 1.170 | Loss 2.652e-04 | Grad 1.226e-03 | EvalR 2.100 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 4500 / 10000 eps 0.109 | avgR 1.637 | Loss 5.639e-04 | Grad 8.385e-04 | EvalR 2.100 5000 / 10000 eps 0.010 | avgR 1.971 | Loss 2.598e-04 | Grad 3.134e-04 | EvalR 2.100 5500 / 10000 eps 0.010 | avgR 2.048 | Loss 1.248e-04 | Grad 3.742e-04 | EvalR 2.100 6000 / 10000 eps 0.010 | avgR 2.073 | Loss 2.100e-04 | Grad 7.388e-04 | EvalR 2.100 Evaluation ... 
Avg Reward 2.10 | Avg Step 5.00 Saving new model with eval reward 2.100 6500 / 10000 eps 0.010 | avgR 2.049 | Loss 1.675e-04 | Grad 7.924e-04 | EvalR 2.100 7000 / 10000 eps 0.010 | avgR 2.050 | Loss 1.796e-04 | Grad 1.203e-03 | EvalR 2.100 7500 / 10000 eps 0.010 | avgR 2.037 | Loss 6.206e-05 | Grad 3.970e-04 | EvalR 2.100 8000 / 10000 eps 0.010 | avgR 2.079 | Loss 2.688e-05 | Grad 2.486e-04 | EvalR 2.100 Evaluation ... Avg Reward 2.10 | Avg Step 5.00 8500 / 10000 eps 0.010 | avgR 2.051 | Loss 2.111e-06 | Grad 1.664e-04 | EvalR 2.100 9000 / 10000 eps 0.010 | avgR 2.071 | Loss 4.464e-07 | Grad 1.482e-04 | EvalR 2.100 9500 / 10000 eps 0.010 | avgR 2.075 | Loss 1.080e-06 | Grad 7.032e-05 | EvalR 2.100 10000 / 10000 eps 0.010 | avgR 2.055 | Loss 1.266e-06 | Grad 6.352e-05 | EvalR 2.100 Restore model with eval reward 2.100 Test Summary: | Pass Total Time TestMDP DRQN | 1 1 1m44.6s 500 / 10000 eps 0.901 | avgR 0.067 | Loss 6.218e-02 | Grad 4.526e-02 | EvalR -Inf Evaluation ... Avg Reward 3.47 | Avg Step 61.09 1000 / 10000 eps 0.802 | avgR -0.333 | Loss 9.368e-02 | Grad 7.466e-03 | EvalR 3.470 Evaluation ... Avg Reward -2.47 | Avg Step 40.09 1500 / 10000 eps 0.703 | avgR 0.410 | Loss 2.766e-02 | Grad 4.668e-03 | EvalR -2.470 Evaluation ... Avg Reward 2.57 | Avg Step 54.35 2000 / 10000 eps 0.604 | avgR 0.611 | Loss 6.230e-02 | Grad 6.823e-03 | EvalR 2.570 Evaluation ... Avg Reward 3.81 | Avg Step 36.02 2500 / 10000 eps 0.505 | avgR 1.527 | Loss 2.172e-02 | Grad 6.205e-03 | EvalR 3.810 Evaluation ... Avg Reward 4.06 | Avg Step 44.72 3000 / 10000 eps 0.406 | avgR 1.432 | Loss 5.789e-02 | Grad 3.071e-03 | EvalR 4.060 Evaluation ... Avg Reward 1.28 | Avg Step 56.99 Saving new model with eval reward 1.280 3500 / 10000 eps 0.307 | avgR 1.816 | Loss 9.177e-02 | Grad 9.783e-03 | EvalR 1.280 Evaluation ... Avg Reward 4.29 | Avg Step 43.90 4000 / 10000 eps 0.208 | avgR 2.157 | Loss 5.079e-02 | Grad 6.382e-03 | EvalR 4.290 Evaluation ... 
Avg Reward 3.90 | Avg Step 43.25 4500 / 10000 eps 0.109 | avgR 2.275 | Loss 3.248e-02 | Grad 7.243e-03 | EvalR 3.900 Evaluation ... Avg Reward 3.22 | Avg Step 46.90 5000 / 10000 eps 0.010 | avgR 2.441 | Loss 1.089e-01 | Grad 5.396e-02 | EvalR 3.220 Evaluation ... Avg Reward 4.59 | Avg Step 49.25 5500 / 10000 eps 0.010 | avgR 2.618 | Loss 8.642e-02 | Grad 1.164e-02 | EvalR 4.590 Evaluation ... Avg Reward 3.59 | Avg Step 40.85 6000 / 10000 eps 0.010 | avgR 2.490 | Loss 2.606e-02 | Grad 1.686e-02 | EvalR 3.590 Evaluation ... Avg Reward 3.59 | Avg Step 55.80 Saving new model with eval reward 3.590 6500 / 10000 eps 0.010 | avgR 2.755 | Loss 7.095e-02 | Grad 1.323e-02 | EvalR 3.590 Evaluation ... Avg Reward 3.95 | Avg Step 47.49 7000 / 10000 eps 0.010 | avgR 2.265 | Loss 1.680e-01 | Grad 1.338e-02 | EvalR 3.950 Evaluation ... Avg Reward 3.66 | Avg Step 53.83 7500 / 10000 eps 0.010 | avgR 2.549 | Loss 2.872e-02 | Grad 4.002e-02 | EvalR 3.660 Evaluation ... Avg Reward 4.52 | Avg Step 47.85 8000 / 10000 eps 0.010 | avgR 2.863 | Loss 5.626e-02 | Grad 6.115e-03 | EvalR 4.520 Evaluation ... Avg Reward 6.06 | Avg Step 41.42 8500 / 10000 eps 0.010 | avgR 2.716 | Loss 8.934e-02 | Grad 1.018e-02 | EvalR 6.060 Evaluation ... Avg Reward 5.59 | Avg Step 53.52 9000 / 10000 eps 0.010 | avgR 3.343 | Loss 2.660e-02 | Grad 1.469e-02 | EvalR 5.590 Evaluation ... Avg Reward 6.55 | Avg Step 34.18 Saving new model with eval reward 6.550 9500 / 10000 eps 0.010 | avgR 3.412 | Loss 9.244e-03 | Grad 2.433e-02 | EvalR 6.550 Evaluation ... Avg Reward 5.30 | Avg Step 47.58 10000 / 10000 eps 0.010 | avgR 3.725 | Loss 5.098e-02 | Grad 6.172e-03 | EvalR 5.300 Restore model with eval reward 6.550 Test Summary: | Pass Total Time GridWorld DDRQN | 1 1 57.7s 500 / 10000 eps 0.901 | avgR -24.785 | Loss 8.155e-03 | Grad 1.009e-02 | EvalR -Inf Evaluation ... Avg Reward 1.01 | Avg Step 101.00 1000 / 10000 eps 0.802 | avgR -25.159 | Loss 1.681e-02 | Grad 1.288e-02 | EvalR 1.010 Evaluation ... 
Avg Reward 1.01 | Avg Step 101.00 1500 / 10000 eps 0.703 | avgR -23.507 | Loss 2.821e-03 | Grad 1.128e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 2000 / 10000 eps 0.604 | avgR -22.784 | Loss 1.088e-02 | Grad 2.641e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 2500 / 10000 eps 0.505 | avgR -21.343 | Loss 4.022e-03 | Grad 4.390e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 3000 / 10000 eps 0.406 | avgR -19.978 | Loss 1.000e-02 | Grad 1.656e-02 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 Saving new model with eval reward 1.010 3500 / 10000 eps 0.307 | avgR -18.654 | Loss 1.140e-02 | Grad 4.508e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 4000 / 10000 eps 0.208 | avgR -17.365 | Loss 3.913e-03 | Grad 5.815e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 4500 / 10000 eps 0.109 | avgR -15.899 | Loss 4.547e-03 | Grad 2.708e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 5000 / 10000 eps 0.010 | avgR -14.406 | Loss 8.237e-03 | Grad 9.793e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 5500 / 10000 eps 0.010 | avgR -13.027 | Loss 1.487e-02 | Grad 7.135e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 6000 / 10000 eps 0.010 | avgR -11.909 | Loss 5.312e-03 | Grad 3.140e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 Saving new model with eval reward 1.010 6500 / 10000 eps 0.010 | avgR -10.930 | Loss 3.772e-03 | Grad 9.426e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 7000 / 10000 eps 0.010 | avgR -10.101 | Loss 8.016e-03 | Grad 5.970e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 7500 / 10000 eps 0.010 | avgR -9.409 | Loss 6.667e-03 | Grad 5.933e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 8000 / 10000 eps 0.010 | avgR -8.777 | Loss 6.796e-03 | Grad 1.313e-02 | EvalR 1.010 Evaluation ... 
Avg Reward 1.01 | Avg Step 101.00 8500 / 10000 eps 0.010 | avgR -8.229 | Loss 5.708e-03 | Grad 1.064e-02 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 9000 / 10000 eps 0.010 | avgR -7.733 | Loss 4.396e-03 | Grad 5.161e-03 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 Saving new model with eval reward 1.010 9500 / 10000 eps 0.010 | avgR -7.299 | Loss 5.992e-03 | Grad 1.106e-02 | EvalR 1.010 Evaluation ... Avg Reward 1.01 | Avg Step 101.00 10000 / 10000 eps 0.010 | avgR -6.908 | Loss 5.871e-03 | Grad 5.920e-03 | EvalR 1.010 Restore model with eval reward 1.010 Test Summary: | Pass Total Time TigerPOMDP DDRQN | 1 1 26.9s Test Summary: | Pass Total Time Static Array Env | 1 1 7.3s here Test Summary: | Pass Total Time Common RL Env | 1 1 4.6s 500 / 10000 eps 0.901 | avgR -2.714 | Loss 1.351e+00 | Grad 9.667e-01 | EvalR -Inf Evaluation ... Avg Reward -0.37 | Avg Step 67.43 1000 / 10000 eps 0.802 | avgR -0.778 | Loss 6.873e-01 | Grad 5.626e-01 | EvalR -0.370 Evaluation ... Avg Reward -2.46 | Avg Step 54.69 1500 / 10000 eps 0.703 | avgR -1.048 | Loss 8.471e-01 | Grad 7.711e-01 | EvalR -2.460 Evaluation ... Avg Reward -0.05 | Avg Step 78.02 2000 / 10000 eps 0.604 | avgR -0.039 | Loss 5.714e-01 | Grad 1.072e+00 | EvalR -0.050 Evaluation ... Avg Reward 0.20 | Avg Step 85.88 2500 / 10000 eps 0.505 | avgR 0.328 | Loss 5.354e-01 | Grad 5.854e-01 | EvalR 0.200 Evaluation ... Avg Reward -0.40 | Avg Step 65.73 3000 / 10000 eps 0.406 | avgR 0.423 | Loss 7.217e-01 | Grad 2.365e-01 | EvalR -0.400 Evaluation ... Avg Reward 1.30 | Avg Step 74.52 Saving new model with eval reward 1.300 3500 / 10000 eps 0.307 | avgR 0.423 | Loss 6.663e-01 | Grad 3.810e-01 | EvalR 1.300 Evaluation ... Avg Reward 3.47 | Avg Step 53.62 4000 / 10000 eps 0.208 | avgR 0.451 | Loss 7.987e-01 | Grad 1.826e-01 | EvalR 3.470 Evaluation ... Avg Reward 1.07 | Avg Step 59.74 4500 / 10000 eps 0.109 | avgR 0.735 | Loss 6.002e-01 | Grad 3.061e-01 | EvalR 1.070 Evaluation ... 
Avg Reward 4.20 | Avg Step 48.96
5000 / 10000 eps 0.010 | avgR 1.275 | Loss 3.696e-01 | Grad 7.500e-01 | EvalR 4.200
Evaluation ... Avg Reward 0.10 | Avg Step 58.45
5500 / 10000 eps 0.010 | avgR 1.637 | Loss 4.279e-01 | Grad 4.859e-01 | EvalR 0.100
Evaluation ... Avg Reward 0.41 | Avg Step 59.57
6000 / 10000 eps 0.010 | avgR 1.539 | Loss 3.400e-01 | Grad 6.315e-01 | EvalR 0.410
Evaluation ... Avg Reward 0.63 | Avg Step 72.49
6500 / 10000 eps 0.010 | avgR 1.725 | Loss 1.918e-01 | Grad 2.185e-01 | EvalR 0.630
Evaluation ... Avg Reward 5.86 | Avg Step 31.20
7000 / 10000 eps 0.010 | avgR 1.510 | Loss 5.539e-01 | Grad 1.613e-01 | EvalR 5.860
Evaluation ... Avg Reward 2.54 | Avg Step 68.09
7500 / 10000 eps 0.010 | avgR 1.784 | Loss 8.510e-01 | Grad 7.759e-02 | EvalR 2.540
Evaluation ... Avg Reward 0.41 | Avg Step 74.58
8000 / 10000 eps 0.010 | avgR 1.520 | Loss 6.058e-01 | Grad 3.073e-01 | EvalR 0.410
Evaluation ... Avg Reward 5.27 | Avg Step 19.24
8500 / 10000 eps 0.010 | avgR 1.980 | Loss 8.568e-01 | Grad 6.558e-02 | EvalR 5.270
Evaluation ... Avg Reward 3.03 | Avg Step 49.34
9000 / 10000 eps 0.010 | avgR 2.480 | Loss 5.606e-01 | Grad 3.068e-01 | EvalR 3.030
Evaluation ... Avg Reward 2.06 | Avg Step 70.66
Saving new model with eval reward 2.060
9500 / 10000 eps 0.010 | avgR 1.990 | Loss 4.283e-01 | Grad 1.182e-01 | EvalR 2.060
Evaluation ... Avg Reward -1.28 | Avg Step 62.57
10000 / 10000 eps 0.010 | avgR 2.069 | Loss 4.761e-01 | Grad 4.951e-01 | EvalR -1.280
Restore model with eval reward 2.060
Total discounted reward for 1 simulation: 0.0
500 / 10000 eps 0.901 | avgR -2.714 | Loss 1.351e+00 | Grad 9.667e-01 | EvalR -Inf
Evaluation ... Avg Reward -0.37 | Avg Step 67.43
1000 / 10000 eps 0.802 | avgR -0.778 | Loss 6.873e-01 | Grad 5.626e-01 | EvalR -0.370
Evaluation ... Avg Reward -2.46 | Avg Step 54.69
1500 / 10000 eps 0.703 | avgR -1.048 | Loss 8.471e-01 | Grad 7.711e-01 | EvalR -2.460
Evaluation ...
Avg Reward -0.05 | Avg Step 78.02
2000 / 10000 eps 0.604 | avgR -0.039 | Loss 5.714e-01 | Grad 1.072e+00 | EvalR -0.050
Evaluation ... Avg Reward 0.20 | Avg Step 85.88
2500 / 10000 eps 0.505 | avgR 0.328 | Loss 5.354e-01 | Grad 5.854e-01 | EvalR 0.200
Evaluation ... Avg Reward -0.40 | Avg Step 65.73
3000 / 10000 eps 0.406 | avgR 0.423 | Loss 7.217e-01 | Grad 2.365e-01 | EvalR -0.400
Evaluation ... Avg Reward 1.30 | Avg Step 74.52
Saving new model with eval reward 1.300
3500 / 10000 eps 0.307 | avgR 0.423 | Loss 6.663e-01 | Grad 3.810e-01 | EvalR 1.300
Evaluation ... Avg Reward 3.47 | Avg Step 53.62
4000 / 10000 eps 0.208 | avgR 0.451 | Loss 7.987e-01 | Grad 1.826e-01 | EvalR 3.470
Evaluation ... Avg Reward 1.07 | Avg Step 59.74
4500 / 10000 eps 0.109 | avgR 0.735 | Loss 6.002e-01 | Grad 3.061e-01 | EvalR 1.070
Evaluation ... Avg Reward 4.20 | Avg Step 48.96
5000 / 10000 eps 0.010 | avgR 1.275 | Loss 3.696e-01 | Grad 7.500e-01 | EvalR 4.200
Evaluation ... Avg Reward 0.10 | Avg Step 58.45
5500 / 10000 eps 0.010 | avgR 1.637 | Loss 4.279e-01 | Grad 4.859e-01 | EvalR 0.100
Evaluation ... Avg Reward 0.41 | Avg Step 59.57
6000 / 10000 eps 0.010 | avgR 1.539 | Loss 3.400e-01 | Grad 6.315e-01 | EvalR 0.410
Evaluation ... Avg Reward 0.63 | Avg Step 72.49
6500 / 10000 eps 0.010 | avgR 1.725 | Loss 1.918e-01 | Grad 2.185e-01 | EvalR 0.630
Evaluation ... Avg Reward 5.86 | Avg Step 31.20
7000 / 10000 eps 0.010 | avgR 1.510 | Loss 5.539e-01 | Grad 1.613e-01 | EvalR 5.860
Evaluation ... Avg Reward 2.54 | Avg Step 68.09
7500 / 10000 eps 0.010 | avgR 1.784 | Loss 8.510e-01 | Grad 7.759e-02 | EvalR 2.540
Evaluation ... Avg Reward 0.41 | Avg Step 74.58
8000 / 10000 eps 0.010 | avgR 1.520 | Loss 6.058e-01 | Grad 3.073e-01 | EvalR 0.410
Evaluation ... Avg Reward 5.27 | Avg Step 19.24
8500 / 10000 eps 0.010 | avgR 1.980 | Loss 8.568e-01 | Grad 6.558e-02 | EvalR 5.270
Evaluation ...
Avg Reward 3.03 | Avg Step 49.34
9000 / 10000 eps 0.010 | avgR 2.480 | Loss 5.606e-01 | Grad 3.068e-01 | EvalR 3.030
Evaluation ... Avg Reward 2.06 | Avg Step 70.66
Saving new model with eval reward 2.060
9500 / 10000 eps 0.010 | avgR 1.990 | Loss 4.283e-01 | Grad 1.182e-01 | EvalR 2.060
Evaluation ... Avg Reward -1.28 | Avg Step 62.57
10000 / 10000 eps 0.010 | avgR 2.069 | Loss 4.761e-01 | Grad 4.951e-01 | EvalR -1.280
Restore model with eval reward 2.060
Test Summary:   | Total   Time
README Examples |     0  11.7s
Testing DeepQLearning tests passed
Testing completed after 491.17s
PkgEval succeeded after 999.39s