Saturday, March 12, 2016

Configuring log4j on WebLogic Server for web applications

To configure WebLogic Server:
  • Go to WL_HOME/server/lib and add wllog4j.jar to the server CLASSPATH by copying the file into DOMAIN_NAME/lib.
  • Download the log4j jar if you do not already have it (I did not) from http://logging.apache.org/log4j/1.2/download.html . At the time of writing the latest available version is log4j-1.2.17.jar. Copy this file into DOMAIN_NAME/lib as well.
  • Activate log4j. In this case I use WLST (WebLogic Scripting Tool), as below:
  • On Windows, open a terminal window, go to DOMAIN_NAME/bin and run setDomainEnv.cmd (this sets up the environment needed to run java).
  • Execute the following commands:

                                        C:\>java weblogic.WLST
                                       wls:/offline> connect('username','password')
                                       wls:/mydomain/serverConfig> edit()
                                       wls:/mydomain/edit> startEdit()
                                       wls:/mydomain/edit !> cd("Servers/$YOUR_SERVER_NAME/Log/$YOUR_SERVER_NAME")
                                       wls:/mydomain/edit/Servers/myserver/Log/myserver !> cmo.setLog4jLoggingEnabled(true)
                                       wls:/mydomain/edit/Servers/myserver/Log/myserver !> save()
                                       wls:/mydomain/edit/Servers/myserver/Log/myserver !> activate()
  • Use ls() at any point to list the objects under the current WLST directory.
  • This activates log4j for use with WLS.
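
The WLST session above can be sketched as a small helper that builds the same command sequence for any server name. This is only an illustration: the function names and the placeholder credentials are my own, and the commands are produced as text so they can be reviewed before being pasted into WLST.

```python
def log_mbean_path(server_name):
    """Return the WLST path of a server's Log MBean, as used in the cd() step."""
    return "Servers/%s/Log/%s" % (server_name, server_name)


def enable_log4j_commands(username, password, server_name):
    """Build the WLST commands from the steps above as a list of strings."""
    return [
        "connect('%s','%s')" % (username, password),
        "edit()",
        "startEdit()",
        'cd("%s")' % log_mbean_path(server_name),
        "cmo.setLog4jLoggingEnabled(true)",
        "save()",
        "activate()",
    ]


if __name__ == "__main__":
    # 'weblogic', 'welcome1' and 'myserver' are placeholder values.
    for command in enable_log4j_commands("weblogic", "welcome1", "myserver"):
        print(command)
```

Running it prints one WLST command per line, in the same order as the interactive session.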


To configure applications :
  • Create a log4j.properties file as below:

                                  log4j.debug=TRUE
                                  log4j.rootLogger=INFO, R
                                  log4j.appender.R=org.apache.log4j.RollingFileAppender
                                  log4j.appender.R.File=/home/server.log
                                  log4j.appender.R.MaxFileSize=100KB
                                  log4j.appender.R.MaxBackupIndex=5
                                  log4j.appender.R.layout=org.apache.log4j.PatternLayout
                                  log4j.appender.R.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss.SSSS} %p %t %c - %m%n
  • Copy the file to the /WEB-INF/classes directory of your application.
  • Also carry out the WLST steps described earlier to activate log4j on WLS.
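
To get a feel for what the RollingFileAppender settings above mean on disk, here is a rough sketch that reads MaxFileSize and MaxBackupIndex and estimates the upper bound for the log files. The helper names are mine, and the result is only approximate, because log4j rolls the file after a write crosses the limit.

```python
def parse_log4j_size(value):
    """Convert a log4j size such as '100KB' to bytes (1024-based units)."""
    units = {"KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}
    text = value.strip().upper()
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(text[: -len(suffix)]) * factor
    return int(text)


def load_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, value = line.split("=", 1)
            props[key.strip()] = value.strip()
    return props


def approx_max_log_bytes(props, appender="R"):
    """Active file plus MaxBackupIndex rolled files, each about MaxFileSize."""
    prefix = "log4j.appender.%s." % appender
    max_size = parse_log4j_size(props[prefix + "MaxFileSize"])
    backups = int(props[prefix + "MaxBackupIndex"])
    return max_size * (backups + 1)


SAMPLE = """
log4j.rootLogger=INFO, R
log4j.appender.R=org.apache.log4j.RollingFileAppender
log4j.appender.R.MaxFileSize=100KB
log4j.appender.R.MaxBackupIndex=5
"""

if __name__ == "__main__":
    print(approx_max_log_bytes(load_properties(SAMPLE)))  # 614400 bytes, about 600KB
```

With the values from the properties file above, that is server.log plus five backups of roughly 100KB each.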


OutOfMemory on PermGen when deploying an application

Issue: While deploying an application we started getting PermGen errors very frequently.

Root cause of OOM on perm gen when deploying an application:

Interned java.lang.String objects are also stored in the permanent generation. The java.lang.String class maintains a pool of strings. When the intern method is invoked, the method checks the pool to see if an equal string is already in the pool. If there is, then the intern method returns it; otherwise it adds the string to the pool. In more precise terms, the java.lang.String.intern method is used to obtain the canonical representation of the string; the result is a reference to the same class instance that would be returned if that string appeared as a literal. If an application interns a huge number of strings, the permanent generation might need to be increased from its default setting.

When this kind of error occurs, the text String.intern or ClassLoader.defineClass might appear near the top of the stack trace that is printed. This is the key indicator here, as all of our OOM errors were throwing messages of this kind.

Solution: Increase the PermGen size to a much greater value, say 1024m (on HotSpot JVMs before Java 8 this is the -XX:MaxPermSize option), and test.

How to use jconsole to manage JMX on weblogic


  • Start JConsole as shown below (the JConsole and weblogic.jar paths are environment specific):
  • C:\jdk1.6.0_45\bin>jconsole -J-Djava.class.path=C:\jdk1.6.0_45\lib\jconsole.jar;C:\jdk1.6.0_45\lib\tools.jar;C:\Oracle\Middleware\wlserver_10.3\server\lib\weblogic.jar -J-Djmx.remote.protocol.provider.pkgs=weblogic.management.remote -debug
  • Once the JConsole window appears, connect to the Runtime MBean Server as follows:
  • Enter the URL in the Remote Process field: service:jmx:iiop://localhost:7001/jndi/weblogic.management.mbeanservers.runtime
  • Provide the WebLogic username and password.
       Note: Change the host and port to match the server's ListenAddress and ListenPort.
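
Since the host and port differ per environment, a tiny helper can build the JMX service URL for you. This is just a sketch: the function name is my own, and only the "runtime" MBean server name comes from this post; "domainruntime" and "edit" are the other WebLogic MBean server JNDI names and are included as an assumption that you may want them too.

```python
MBEAN_SERVERS = ("runtime", "domainruntime", "edit")


def weblogic_jmx_url(host="localhost", port=7001, mbean_server="runtime"):
    """Build the service URL to paste into JConsole for a WebLogic MBean server."""
    if mbean_server not in MBEAN_SERVERS:
        raise ValueError("unknown MBean server: %r" % mbean_server)
    return "service:jmx:iiop://%s:%d/jndi/weblogic.management.mbeanservers.%s" % (
        host,
        port,
        mbean_server,
    )


if __name__ == "__main__":
    print(weblogic_jmx_url())
    # 'wls01.example.com' and 8001 are placeholder values.
    print(weblogic_jmx_url("wls01.example.com", 8001))
```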

WebGate Generates Error "Failure to connect to Access Server"

After configuring WebGate 11g with Oracle Access Manager (OAM) 11g Server, attempts to access any page using the WebGate webserver hostname and port fail with either HTTP-500 Internal Server Error or message "The AccessGate is unable to contact any Access Servers".

Example WebGate oblog.log message:
ACCESS_GATE ERROR ade/aime_ngamac_497961/ngamac/src/palantir/webgate2/src/isprotected.cpp: "Failure to connect to Access Server" HTTPStatus^500 Error^The Access Server has returned a fatal error with no detailed information.

What is causing the error:

The error indicates that the WebGate is unable to communicate with the OAM Server.
WebGate uses its agent artifact files to know which OAM Server host and port to connect to, with which agent password, and in which communication mode.
Artifact files are:
ObAccessClient.xml  --  WebGate agent configuration file, generated/updated via OAM Console or RREG; should never be edited directly
cwallet.sso         --  11g WebGate only, agent key
password.xml        --  Simple/Cert mode only, contains the Simple mode passphrase or Cert mode agent key
aaa_key.pem         --  Simple/Cert mode only, contains the WebGate certificate key
aaa_cert.pem        --  Simple/Cert mode only, contains the WebGate Simple/Cert mode certificate
aaa_chain.pem       --  Cert mode only, contains the root Certificate Authority certificate, and sub-CA certificates if applicable
  • The agent artifacts may not have been copied from the OAM Server to the WebGate configuration directory after agent registration, or not all of the files were copied.
  • The agent password may have been changed in the agent configuration, but the modified artifact files were not copied to the WebGate configuration directory.
  • The communication mode of the agent may have been changed, but the new artifacts and certificates were not copied over.
  • Any discrepancy between the configuration of the agent in OAM Console and the artifacts in the WebGate configuration directory will cause this communication failure.
  • Any SSL handshake failure due to missing root CA certificates will also cause the communication to fail.

Solutions:
  • Copy all the necessary artifacts for the selected agent communication mode from the OAM Server to the WebGate configuration directory again, and restart the WebGate webserver.
  • Delete and re-register the agent, then copy over the new artifacts.
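
Before re-copying everything, it can help to check which artifacts are actually missing. The sketch below encodes the artifact list above for an 11g WebGate and reports missing files per communication mode. The function name, the mode keys and the demo directory are my own assumptions.

```python
import os
import tempfile

# Artifacts expected in the WebGate config directory (11g WebGate assumed).
_COMMON = ["ObAccessClient.xml", "cwallet.sso"]
_SIMPLE = _COMMON + ["password.xml", "aaa_key.pem", "aaa_cert.pem"]
REQUIRED_ARTIFACTS = {
    "open": _COMMON,
    "simple": _SIMPLE,
    "cert": _SIMPLE + ["aaa_chain.pem"],
}


def missing_artifacts(config_dir, mode):
    """Return the artifact file names that are absent for the given mode."""
    required = REQUIRED_ARTIFACTS[mode.lower()]
    return [
        name
        for name in required
        if not os.path.isfile(os.path.join(config_dir, name))
    ]


if __name__ == "__main__":
    # Demo against a throwaway directory holding only ObAccessClient.xml.
    with tempfile.TemporaryDirectory() as config_dir:
        open(os.path.join(config_dir, "ObAccessClient.xml"), "w").close()
        print(missing_artifacts(config_dir, "cert"))
```

Point config_dir at your real WebGate configuration directory and pass the mode configured for the agent in OAM Console.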

Enable TRACE logging for 11g WebGate

WEBGATE TRACE INSTRUCTIONS
  • To enable TRACE logging for 11g WebGate:
  • First back up the file ORACLE_INSTANCE/config/OHS/OHS_INSTANCE_NAME/webgate/config/oblog_config_wg.xml
  • Then, in the original oblog_config_wg.xml, set:
    <SimpleList>
        <NameValPair ParamName="LOG_THRESHOLD_LEVEL" Value="LOGLEVEL_TRACE"></NameValPair>
    </SimpleList>

    NOTE: Do not modify any other LOGLEVEL settings in that file.

  • Change the BUFFER_SIZE in the oblog config file so that log entries are flushed to the file promptly:
    <NameValPair ParamName="BUFFER_SIZE" Value="4"></NameValPair>

    Note: A WebGate webserver restart is not necessary.

  • To disable TRACE logging, simply restore the original oblog_config_wg.xml file from the backup.



Increase logging level on the OAM Server to Trace 32 from OAM console.


OAM SERVER TRACE 32 INSTRUCTIONS
How to increase the logging for the {odl-handler} to TRACE:32, which adds more detail to the {oam managed server}-diagnostic.log file.

To set this up:
1) Open the /em console
2) Expand the Farm_base_domain
3) Expand Identity and Access
4) Expand OAM
5) Right click on oam_server
6) Click on Logs -> Log Configuration
7) On the Log Files tab, click on odl-handler to select it
8) Click Edit Configuration
9) Change the Logging Level to TRACE:32 (note the level that was already set, so you can restore it later)
10) Click OK
11) Click Close
12) On the Log Levels tab, use pull down menu next to Root Logger to change it to TRACE:32 (FINEST)
13) Click Apply
14) Click Yes
15) Click Close

Note the time, then:
16) Recreate your issue and wait for your application to process.
17) Set the Logging Level back to its previous value and apply.

Once complete:

18) The trace will be in the {oam managed server}-diagnostic.log file.

OPatch Failed with lock file exists in ORACLE_HOME

Retrying opatch apply after a failed attempt resulted in the following error: "Lock file left by a different patch, OPatch will not try re-using the lock file"

Solutions:
1. Look in the oraInventory and check whether there is a locks directory; if so, remove it.
/apps/oracle/oraInventory/locks

2. Inside .patch_storage there is another file called "patch_locked", which records which patch the lock was taken for. Removing the "patch_locked" file resolved the issue.
/apps/soa/fmw11.1.1.7/Oracle_SOA/.patch_storage/patch_locked
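
A quick way to see which of the two leftovers exist before removing anything is sketched below. It only reports; it does not delete. The function name is mine, and the paths in the demo are the ones from this post, so adjust them to your environment.

```python
import os
import tempfile


def find_opatch_locks(inventory_dir, oracle_home):
    """Return the leftover lock paths that exist: the inventory 'locks'
    directory and the 'patch_locked' file under .patch_storage."""
    candidates = [
        os.path.join(inventory_dir, "locks"),
        os.path.join(oracle_home, ".patch_storage", "patch_locked"),
    ]
    return [path for path in candidates if os.path.exists(path)]


if __name__ == "__main__":
    # Paths taken from this post; an empty list means no leftover locks.
    print(find_opatch_locks("/apps/oracle/oraInventory",
                            "/apps/soa/fmw11.1.1.7/Oracle_SOA"))
```

Make sure no OPatch or OUI session is still running before you remove anything it reports.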


Error Trace:
/apps/soa/fmw/oracle_common/OPatch/opatch apply
Oracle Interim Patch Installer version 11.1.0.9.9
Copyright (c) 2012, Oracle Corporation.  All rights reserved.

Verifying environment and performing prerequisite checks...
[ Error during Oracle Home discovery Phase]. Detail: OPatchSession cannot load inventory for the given Oracle Home /opt/app/11.2.0/grid4. Possible causes are:
   No read or write permission to ORACLE_HOME/.patch_storage
   Central Inventory is locked by another OUI instance
   No read permission to Central Inventory
   The lock file exists in ORACLE_HOME/.patch_storage
   The Oracle Home does not exist in Central Inventory

OutOfMemory on getNewTla

A WebLogic managed server failed sporadically with an out of memory error on the TLA. After investigating the errors in the logs and researching the Oracle forums, I learned that the Thread Local Area (TLA) size was left at its default value of 2KB, which was not sufficient, hence we were encountering OOM on the TLA.
On some operating systems the JVM handles TLA memory allocation dynamically, but on others the size must be increased manually. This may require some experimentation, as it is hard to say exactly how large it needs to be; you can increase it gradually until the error goes away.

Error:
####<Warning> <RMI> <[ACTIVE] ExecuteThread: '22' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <7b90471a4de3bc66:-511d663d:151a1e81621:-8000-000000000002f2bd> <1450407974931> <BEA-080004> <An error was thrown by rmi server: javax.management.remote.rmi.RMIConnectionImpl.queryNames(Ljavax.management.ObjectName;Ljava.rmi.MarshalledObject;Ljavax.security.auth.Subject;) 
java.lang.OutOfMemoryError: getNewTla. 
java.lang.OutOfMemoryError: getNewTla 
<Error> <RMI> <ExecuteThread: '2' for queue: 'weblogic.socket.Muxer'> <<WLS Kernel>> <> <b2ae77c08e5e739b:4d6de871:151889120ef:-8000-000000000002f0c8> <1449981645585> <BEA-080001> <Error in Dispatcher 
java.lang.OutOfMemoryError. 
java.lang.OutOfMemoryError 
####< <weblogic.cluster.MessageReceiver> <<WLS Kernel>> <> <b2ae77c08e5e739b:4d6de871:151889120ef:-8000-000000000002f0ca> <1449981646763> <BEA-003107> <Lost 1 unicast message(s).> 
####< <[ACTIVE] ExecuteThread: '25' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <b2ae77c08e5e739b:4d6de871:151889120ef:-8000-000000000002f0cb> <1449981649184> <BEA-002634> <The server "AdminServer" disconnected from this server.> 

Script: "setDomainEnv.sh":
JAVA_OPTIONS="${JAVA_OPTIONS} ${JAVA_PROPERTIES} -Dwlw.iterativeDev=${iterativeDevFlag} -Dwlw.testConsole=${testConsoleFlag} -Dwlw.logErrorsToConsole=${logErrorsToConsoleFlag} -XXtlasize:min=8k,preferred=512k"
export JAVA_OPTIONS


***Use this option with caution, as changing the thread-local area size can have severe impact on performance.***

Specify <size> in bytes, using the normal K,M,G suffixes.

For example:
-XXtlasize:min=8k,preferred=512k
sets a TLA size suitable for heaps of several GB.

Note: The old style of setting TLA size (that is, -XXtlasize=256k) is still supported but has been deprecated. If you use the old style, JRockit JVM will interpret the option as if the fixed parameter was used; for example, -XXtlasize=256k would be interpreted as -XXtlasize:fixed=256k.
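
To make the two option styles concrete, here is a small sketch that parses an -XXtlasize string into byte values, treating the deprecated form as fixed as described above. The parser is my own illustration, not part of JRockit, and it assumes 1024-based K/M/G suffixes.

```python
import re

_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '512k' to bytes."""
    match = re.match(r"^(\d+)([kmg]?)$", text.strip().lower())
    if not match:
        raise ValueError("not a size: %r" % text)
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_tlasize(option):
    """Parse '-XXtlasize:min=8k,preferred=512k' or the deprecated
    '-XXtlasize=256k' into a dict of parameter name -> bytes."""
    prefix = "-XXtlasize"
    if not option.startswith(prefix):
        raise ValueError("not a tlasize option: %r" % option)
    rest = option[len(prefix):]
    if rest.startswith("="):
        # Deprecated style: interpreted as if 'fixed' had been given.
        return {"fixed": parse_size(rest[1:])}
    if rest.startswith(":"):
        result = {}
        for part in rest[1:].split(","):
            key, value = part.split("=", 1)
            result[key] = parse_size(value)
        return result
    raise ValueError("malformed tlasize option: %r" % option)


if __name__ == "__main__":
    print(parse_tlasize("-XXtlasize:min=8k,preferred=512k"))
    print(parse_tlasize("-XXtlasize=256k"))
```

The first call returns min=8192 and preferred=524288 bytes; the second returns fixed=262144 bytes.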

Reference:
http://docs.oracle.com/cd/E13150_01/jrockit_jvm/jrockit/jrdocs/refman/optionXX.html 

Overview of Oracle Identity Federation

What exactly is OIF?
Basically, it allows different entities to share their services using a global identity maintained by one of the organizations.

So what exactly does that mean?
Take an example to understand the usage of OIF:
Let's say a company CCC wants to use the services of an insurance company, so that CCC employees can access the Insurance Portal. For this, the insurance company would need the CCC employee database, so that CCC employees can be authenticated and authorized when they use the insurance services. But CCC can't share the database.

So in this scenario, how can the Insurance Portal become part of the CCC company's environment?
The answer is federation.

Here the CCC company, which uses OAM SSO for its employees, decides to enable the Federation feature. A similar OAM setup needs to be done on the insurance side.
So in this case the CCC company acts as the Identity Provider (IdP), while the insurance company acts as the Service Provider (SP).

What are IdP & SP
Oracle Identity Federation supports two integration modes with Oracle Access Manager: authentication mode and SP mode.

Authentication Mode (IdP)
In the authentication mode, Oracle Identity Federation delegates authentication of the user to Oracle Access Manager. The user is redirected to an Oracle Identity Federation resource protected by WebGate that triggers the Oracle Access Manager Authentication flow. Once the user is identified, it will access the resource, and WebGate will provide to Oracle Identity Federation an HTTP header containing the user's identity.

SP Mode
In the SP mode, Oracle Access Manager delegates user authentication to Oracle Identity Federation, which uses the Federation Oracle Single Sign-On protocol with a remote Identity Provider. Once the Federation Oracle Single Sign-On flow is performed, Oracle Identity Federation will create a local session and then propagates the authentication state to Oracle Access Manager, which maintains the session information.

Use Case:
  • The user accesses the CCC company portal and clicks the Insurance portal link. The user is redirected to the Insurance portal, where he is asked to enter his credentials.
  • The user submits his credentials, which are actually stored in the CCC company database. The Insurance site therefore sends the credentials submitted by the user to the CCC company in the form of a SAML 2.0 token.
  • The CCC company replies with SAML 2.0 as well. The Insurance portal reads the token returned by CCC and, depending on whether the reply says the user is valid and authorized, decides whether to let the user access the Insurance services.


In this way the two sites are federated seamlessly.

Introduction to Application Domain and Policy Creation in OAM

http://docs.oracle.com/cd/E27559_01/admin.1112/e27239/app_domn.htm#AIAAG1854

Creating a user in OID

1. Log in to OID: http://<host:port>/odsm
2. Navigate to the "Data Browser" tab and expand the tree "dc" => "cn=Users".
3. One way of creating a user is to copy an existing entry. This way you don't have to provide all the details required for user creation; the new user inherits them from the existing entry (assuming the new user falls under the same group/dc).
4. Select the existing user entry you want to copy, then click the 'Create a new entry like this one' icon under "Data Tree".
5. A pop-up window is displayed with "Parent of the entry" already populated (carried over from the existing user). Select Next.
6. Provide 'cn' = user first name and 'sn' = user last name, and choose 'UID' as the relative distinguished name. This automatically populates the "Distinguished Name". Select Next.
7. On the Optional Properties page, keep the default details.
8. Choose 'Finish' to complete the process.
9. The new user is now created. Search for the user with the search option under "Data Tree" to confirm the entry exists before informing the user.


Note: The "relative distinguished name" selection depends on your environment. In some cases it could be "CN".

Disabling the User Account in OID

Disabling the User Account in OID using ODSM.

1. Log in to OID: http://<host:port>/odsm
2. Navigate to the "Data Browser" tab and expand the "dc" subdirectory, or alternatively enter the user ID in the Search option and hit the search button.
3. The search returns the user entry under the "Data Tree" panel.
4. Select the user under the "Data Tree" panel; the user details page opens in the right-hand pane.
5. Now we need to add an attribute that will disable the account. Open the 'Attributes' tab in the right pane.
6. We need to add the optional attribute 'orclIsEnabled'. Choose to add an attribute under the "optional attributes" section.
7. A pop-up window lists the different attributes that are available. Choose the 'orclIsEnabled' attribute, then click 'Add Attributes'.
8. The default value of the 'orclIsEnabled' attribute is 'ENABLED', but we need to set it to 'DISABLED'.
9. Once you have set the attribute value, click 'Apply' at the top right.

Note: The value of the field has to be set to 'DISABLED'; if you set it to 'FALSE' it won't work.
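
If you prefer the command line to ODSM, the same change can be expressed as an LDIF modify record. The sketch below only generates the LDIF text; the function name and the example DN are hypothetical, and I am assuming you would apply the result with a standard LDAP client such as ldapmodify.

```python
def disable_user_ldif(user_dn):
    """Return an LDIF modify record that sets orclIsEnabled to DISABLED."""
    lines = [
        "dn: %s" % user_dn,
        "changetype: modify",
        "replace: orclIsEnabled",
        "orclIsEnabled: DISABLED",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The DN below is a made-up example; use the entry's real DN.
    print(disable_user_ldif("cn=jdoe,cn=Users,dc=example,dc=com"))
```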

java.io.IOException: No locks

We were encountering problems starting both Node Manager and WebLogic on an OID installation. I noticed the errors below in the server logs, and when I ran "/etc/init.d/nfslock status" I got the message "rpc.statd dead but pid file exists".

This error is usually caused by the NFS lockd not running or malfunctioning. Other NFS daemons that are not running or are malfunctioning may cause similar errors.

Engage a Unix admin who can look at the startup script for statd and figure out what it checks for. Usually there is some lock file (not the pid file) that causes statd to abort on startup. Deleting that file and restarting nfslock resolved the issue. Thereafter we were able to bring up Node Manager and WebLogic without problems.
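
To confirm whether file locking works on the file system that holds the domain directory, you can try to take a lock yourself. The sketch below is my own diagnostic, not part of WebLogic: it uses POSIX locks via fcntl and returns False when the operating system answers with ENOLCK ("No locks available"), which is the error behind java.io.IOException: No locks.

```python
import errno
import fcntl
import tempfile


def can_lock(path):
    """Try to take and release an exclusive lock on path.

    Returns False if the OS reports ENOLCK; any other error is re-raised
    (for example when another process already holds the lock).
    """
    with open(path, "a") as handle:
        try:
            fcntl.lockf(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno == errno.ENOLCK:
                return False
            raise
        fcntl.lockf(handle, fcntl.LOCK_UN)
        return True


if __name__ == "__main__":
    # Demo on a local temp file; point it at a file on the NFS mount instead.
    with tempfile.NamedTemporaryFile() as tmp:
        print(can_lock(tmp.name))
```

Run it against a scratch file on the NFS mount before and after restarting nfslock to see whether the fix took effect.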


<BEA-300033> <Could not execute command "getVersion" on the node manager. Reason: "I/O error while reading domain directory".>
<Error> <NodeManager> <BEA-300033> <Could not execute command "getVersion" on the node manager. Reason: "I/O error while reading domain directory".>
<Info> <ServletContext-/bea_wls_internal> <BEA-000000> <HTTPClntLogin: Login rejected with code: 'Failed', reason: 

java.net.ProtocolException: HTTP tunneling is disabled
        at weblogic.rjvm.http.HTTPServerJVMConnection.acceptJVMConnection(HTTPServerJVMConnection.java:84)
        at weblogic.rjvm.http.TunnelLoginServlet.service(TunnelLoginServlet.java:80)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:821)
        at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
        at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
        at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:300)
        at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:184)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.doIt(WebAppServletContext.java:3686)
        at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3650)
        at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
        at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:121)
        at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2268)
        at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2174)
        at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1446)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:201)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:173)


Enable & Disable trace on OAM managed servers.

How to enable and disable trace on OAM managed servers using the command line.

Enable Trace:32:
1. cd /local/apps/oracle/middleware/oracle_idm1/common/bin and start WLST (./wlst.sh)
2. connect()
3. When prompted, enter the admin server URL: t3://mytest.host.com:7001
4. domainRuntime()
5. setLogLevel(target='oam_1', logger='oracle.oam', level='TRACE:32', persist="0")
6. disconnect()
7. exit()

Disable Trace:32:

1. Repeat steps 1 through 4
2. setLogLevel(target='oam_1', logger="oracle.oam", level="", persist="1")
3. disconnect()
4. exit()

Different types of OAM/OES loggers: oracle.oam.engine, oracle.security.am, oracle.jps.policymgm, oracle.jps.authorization

Note: Enabling and disabling the trace is dynamic; there is no need to restart the managed servers.

ERROR: transport error 202: bind failed: Address already in use

Cause: Two managed servers running on the same physical server use different listen ports but the same debugging port.

ERROR: transport error 202: bind failed: Address already in use
ERROR: JDWP Transport dt_socket failed to initialize, JDWP No transports initialized, jvmtiError=AGENT_ERROR_TRANSPORT_INIT(197)
[ERROR] aborted
JRockit aborted: Unknown error (50)

Solution: Search for debugFlag="true" in the default setDomainEnv.sh and change it to debugFlag="false".

setDomainEnv.sh:
    if [ "${DEBUG_PORT}" = "" ] ; then
        DEBUG_PORT="8080"
        export DEBUG_PORT
    fi
    if [ "${SERVER_NAME}" = "" ] ; then
        SERVER_NAME="tst_my_admin"
        export SERVER_NAME
    fi
    debugFlag="false"
    export debugFlag
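
To check whether something is already bound to the debug port before starting a server, a simple bind test is enough. This is a sketch of my own: the function name is hypothetical, 8080 is the DEBUG_PORT value from the script above, and a failure to bind can also mean you lack permission for that port.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding to host:port fails, i.e. the port is taken
    (or cannot be bound for another reason such as missing permission)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
    except OSError:
        return True
    finally:
        sock.close()
    return False


if __name__ == "__main__":
    # 8080 is the DEBUG_PORT default shown in the script above.
    print(port_in_use(8080))
```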

java.lang.ClassNotFoundException: oracle.jrf.wls.JRFStartup

After a fresh installation of SOA, bringing up the SOA managed server caused a lot of application failures.

BEA-000286: Failed to invoke startup class "JRF Startup Class", java.lang.ClassNotFoundException: oracle.jrf.wls.JRFStartup, when trying to start the SOA server using the console (Node Manager) on Linux.

The problem occurs when attempting to start the SOA server from the console while Node Manager is up. There is no problem when starting it via the managed server startup scripts.

Cause: The java.lang.ClassNotFoundException occurs when servers are started through Node Manager because StartScriptEnabled is set to false in the nodemanager.properties file.

Solution: Open wl_home\wlserver_10.3\common\nodemanager\nodemanager.properties on the managed server machine and set StartScriptEnabled=true. Restart the Node Manager, shut down the managed servers and start them from the console.


Friday, March 11, 2016

How to take heap dumps in Linux

To identify what is holding the memory (why GC is not releasing it; there could be a memory leak somewhere), take a heap dump when the condition arises and use it to fix the OutOfMemory situation.

Capture heap dump using jmap:

Use the following command to generate a heap dump when the server is at a critical memory level (approaching the limit for an out of memory error). This typically generates a very large file called heap.bin.

Syntax: /Java/jdk/bin/jmap -heap:format=b <pid>
Eg: jmap -heap:format=b 19096
On JDK 6 and later the equivalent is: jmap -dump:format=b,file=heap.bin <pid>

<pid> is the process ID of the WebLogic server instance


Capture heap dump when OOM occurs:

Enable the flag "-XX:+HeapDumpOnOutOfMemoryError" in the server startup script; it generates a heap dump when an OOM happens.

Eg: JAVA_OPTIONS="-verbose:gc -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:/local/apps/oracle/middleware/user_domains/test/bin/gc.txt -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/local/apps/oracle/middleware/user_domains/test/bin ${JAVA_OPTIONS}"

Review the heap dump to understand which objects are occupying the memory and how quickly they accumulate until they reach the JVM memory threshold.