Need some help on a problem with Pacemaker/corosync on fc12

Need some help on a problem with Pacemaker/corosync on fc12

Moullé Alain
Hi,

I've retrieved the following rpms for fc12:
cluster-glue-1.0.3-1.fc12.x86_64.rpm
cluster-glue-debuginfo-1.0.3-1.fc12.x86_64.rpm
cluster-glue-libs-1.0.3-1.fc12.x86_64.rpm
corosync-1.2.0-1.fc12.x86_64.rpm
corosync-debuginfo-1.2.0-1.fc12.x86_64.rpm
corosynclib-1.2.0-1.fc12.x86_64.rpm
heartbeat-3.0.2-2.fc12.x86_64.rpm
heartbeat-debuginfo-3.0.2-2.fc12.x86_64.rpm
heartbeat-libs-3.0.2-2.fc12.x86_64.rpm
pacemaker-1.0.7-4.fc12.x86_64.rpm
pacemaker-debuginfo-1.0.7-4.fc12.x86_64.rpm
pacemaker-libs-1.0.7-4.fc12.x86_64.rpm

and I'm facing a problem on the start of any resource:
restofencenode1_start_0 (node=node3, call=4, rc=1, status=complete):
unknown error

/var/log/messages gives:
Feb 15 15:30:39 node3 pengine: [3876]: WARN: unpack_rsc_op: Processing
failed op restofencenode1_start_0 on node3: unknown error (1)

and crm_verify -L -V gives:
crm_verify[3991]: 2010/02/15_15:37:20 WARN: unpack_rsc_op: Processing
failed op restofencenode1_start_0 on node3: unknown error (1)
crm_verify[3991]: 2010/02/15_15:37:20 WARN: common_apply_stickiness:
Forcing restofencenode1 away from node3 after 1000000 failures (max=1000000)
crm_verify[3991]: 2010/02/15_15:37:20 WARN: native_color: Resource
restofencenode1 cannot run anywhere
crm_verify[3991]: 2010/02/15_15:37:20 WARN: native_color: Resource
restofencenode3 cannot run anywhere
Warnings found during check: config may not be valid
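
The 1000000 in those warnings is Pacemaker's internal INFINITY: score
arithmetic saturates at ±1000000, and anything combined with -INFINITY
stays -INFINITY, so the failcount penalty from the failed start overrides
even a +INFINITY location preference. A minimal sketch of that rule
(shell stands in for Pacemaker's internal C code here):

```shell
# Pacemaker's saturating score arithmetic, sketched in shell:
# -INFINITY dominates, then +INFINITY, then plain clamped addition.
INF=1000000
score_add() {
  if [ "$1" -le "-$INF" ] || [ "$2" -le "-$INF" ]; then
    echo "-$INF"
  elif [ "$1" -ge "$INF" ] || [ "$2" -ge "$INF" ]; then
    echo "$INF"
  else
    echo $(( $1 + $2 ))
  fi
}
# +INFINITY location preference combined with the -INFINITY failure
# penalty: the node is still ruled out.
score_add "$INF" "-$INF"
```

Which is why the failcount has to be cleared (crm_resource -C -r
restofencenode1) before the cluster will try the resource on node3 again.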

I can't find the problem, since I did the same thing as with the fc11
rpms; my restofencenode1 is defined in cib.xml like this:
    <resources>
      <primitive class="stonith" id="restofencenode1" type="external/ipmi">
        <meta_attributes id="restofencenode1-meta_attributes">
          <nvpair id="restofencenode1-meta_attributes-target-role"
name="target-role" value="Started"/>
          <nvpair id="restofencenode1-meta_attributes-hostname"
name="hostname" value="node3"/>
          <nvpair id="restofencenode1-meta_attributes-ipaddr"
name="ipaddr" value="12.1.1.121"/>
          <nvpair id="restofencenode1-meta_attributes-userid"
name="userid" value="mylogin"/>
          <nvpair id="restofencenode1-meta_attributes-password"
name="password" value="mypass"/>
          <nvpair id="restofencenode1-meta_attributes-interface"
name="interface" value="lan"/>
        </meta_attributes>
      </primitive>
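
One thing worth double-checking (an observation, not something confirmed
in the thread): in Pacemaker 1.0, parameters consumed by the agent
itself (hostname, ipaddr, userid, password, interface) normally belong
in an instance_attributes block, while meta_attributes is read by the
cluster (for things like target-role) and is not passed to the stonith
plugin, which would make the plugin fail its start. A sketch of the same
primitive with that split applied:

```xml
<primitive class="stonith" id="restofencenode1" type="external/ipmi">
  <meta_attributes id="restofencenode1-meta_attributes">
    <nvpair id="restofencenode1-meta_attributes-target-role"
            name="target-role" value="Started"/>
  </meta_attributes>
  <instance_attributes id="restofencenode1-instance_attributes">
    <nvpair id="restofencenode1-instance_attributes-hostname"
            name="hostname" value="node3"/>
    <nvpair id="restofencenode1-instance_attributes-ipaddr"
            name="ipaddr" value="12.1.1.121"/>
    <nvpair id="restofencenode1-instance_attributes-userid"
            name="userid" value="mylogin"/>
    <nvpair id="restofencenode1-instance_attributes-password"
            name="password" value="mypass"/>
    <nvpair id="restofencenode1-instance_attributes-interface"
            name="interface" value="lan"/>
  </instance_attributes>
</primitive>
```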

I've also tried with a simple lsb resource, and I got the same result.

Note that I have the same problem with the official releases provided
with fc12:
cluster-glue-1.0-0.11.b79635605337.hg.fc12.x86_64.rpm
cluster-glue-libs-1.0-0.11.b79635605337.hg.fc12.x86_64.rpm
corosync-1.1.2-1.fc12.x86_64.rpm
corosynclib-1.1.2-1.fc12.x86_64.rpm
heartbeat-3.0.0-0.5.0daab7da36a8.hg.fc12.x86_64.rpm
heartbeat-libs-3.0.0-0.5.0daab7da36a8.hg.fc12.x86_64.rpm
openais-1.1.0-1.fc12.x86_64.rpm
openaislib-1.1.0-1.fc12.x86_64.rpm
pacemaker-1.0.5-4.fc12.x86_64.rpm
pacemaker-libs-1.0.5-4.fc12.x86_64.rpm
pacemaker-libs-devel-1.0.5-4.fc12.x86_64.rpm

so I guess I've made a mistake somewhere, but I can't find it ...

Thanks a lot if someone can help me.
Regards
Alain

_______________________________________________
Linux-HA mailing list
[hidden email]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems

Re: Need some help on a problem with Pacemaker/corosync on fc12

Dejan Muhamedagic
Hi,

On Mon, Feb 15, 2010 at 04:17:27PM +0100, Alain.Moulle wrote:

> Hi,
>
> I've retrieved the following rpms for fc12 :
> cluster-glue-1.0.3-1.fc12.x86_64.rpm
> cluster-glue-debuginfo-1.0.3-1.fc12.x86_64.rpm
> cluster-glue-libs-1.0.3-1.fc12.x86_64.rpm
> corosync-1.2.0-1.fc12.x86_64.rpm
> corosync-debuginfo-1.2.0-1.fc12.x86_64.rpm
> corosynclib-1.2.0-1.fc12.x86_64.rpm
> heartbeat-3.0.2-2.fc12.x86_64.rpm
> heartbeat-debuginfo-3.0.2-2.fc12.x86_64.rpm
> heartbeat-libs-3.0.2-2.fc12.x86_64.rpm
> pacemaker-1.0.7-4.fc12.x86_64.rpm
> pacemaker-debuginfo-1.0.7-4.fc12.x86_64.rpm
> pacemaker-libs-1.0.7-4.fc12.x86_64.rpm
>
> and I'm facing a problem on start of any resource :
> restofencenode1_start_0 (node=node3, call=4, rc=1, status=complete):
> unknown error
>
> /var/log/messages gives :
> Feb 15 15:30:39 node3 pengine: [3876]: WARN: unpack_rsc_op: Processing
> failed op restofencenode1_start_0 on node3: unknown error (1)

There should be more logs; just grep for the resource id. If nothing
more turns up, please make an hb_report.
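
Concretely, the idea is to look at the lrmd/stonithd lines around the
failed start, since pengine only re-reports the stored failure. A sketch
against a simulated excerpt (on the cluster the file would be
/var/log/messages):

```shell
# Simulated log excerpt standing in for /var/log/messages:
cat > /tmp/messages.sample <<'EOF'
Feb 15 15:30:39 node3 lrmd: [3870]: info: rsc:restofencenode1:4: start
Feb 15 15:30:39 node3 pengine: [3876]: WARN: unpack_rsc_op: Processing failed op restofencenode1_start_0 on node3: unknown error (1)
EOF
# Grep for the resource id and drop pengine's summaries; what is left
# (lrmd, stonithd, the agent itself) usually names the real cause.
grep 'restofencenode1' /tmp/messages.sample | grep -v ' pengine:'
```

If nothing useful shows up, hb_report (shipped with cluster-glue)
bundles the logs, the CIB, and package versions into one tarball, e.g.
hb_report -f "2010-02-15 15:00" /tmp/report (the start time here is just
an assumed reasonable window).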

Thanks,

Dejan


Re: Need some help on a problem with Pacemaker/corosync on fc12

Moullé Alain
In reply to this post by Moullé Alain
Hi,

the only other messages are:

Feb 16 08:00:39 node3 pengine: [3876]: WARN: unpack_rsc_op: Processing
failed op restofencenode1_start_0 on node3: unknown error (1)
Feb 16 08:00:39 node3 pengine: [3876]: notice: native_print:
restofencenode1#011(stonith:external/ipmi):#011Stopped
Feb 16 08:00:39 node3 pengine: [3876]: info: get_failcount:
restofencenode1 has failed 1000000 times on node3
Feb 16 08:00:39 node3 pengine: [3876]: WARN: common_apply_stickiness:
Forcing restofencenode1 away from node3 after 1000000 failures (max=1000000)
Feb 16 08:00:39 node3 pengine: [3876]: WARN: native_color: Resource
restofencenode1 cannot run anywhere
Feb 16 08:00:39 node3 pengine: [3876]: notice: LogActions: Leave
resource restofencenode1#011(Stopped)

Note that I had set the following constraints:
    <constraints>
      <rsc_location id="loc1-restofencenode1" node="node3"
rsc="restofencenode1" score="+INFINITY"/>
      <rsc_location id="neverloc-restofencenode1" node="node1"
rsc="restofencenode1" score="-INFINITY"/>
      <rsc_location id="loc1-restofencenode3" node="node1"
rsc="restofencenode3" score="+INFINITY"/>
      <rsc_location id="neverloc-restofencenode3" node="node3"
rsc="restofencenode3" score="-INFINITY"/>
    </constraints>


Either I made a big mistake, or there is something wrong in these
releases, because I did much the same thing with the fc11 releases and
it worked fine ... that setup ran on openais (openais.conf) rather than
on corosync (corosync.conf), but this looks like a problem with
resource management, not with cluster management ...

Thanks a lot for any help, because I'm stuck ...

Regards
Alain



Re: Need some help on a problem with Pacemaker/corosync on fc12

Dejan Muhamedagic
Hi,

On Tue, Feb 16, 2010 at 08:49:30AM +0100, Alain.Moulle wrote:

> Hi ,
>
> the other messages are only :
>
> Feb 16 08:00:39 node3 pengine: [3876]: WARN: unpack_rsc_op: Processing
> failed op restofencenode1_start_0 on node3: unknown error (1)
> Feb 16 08:00:39 node3 pengine: [3876]: notice: native_print:
> restofencenode1#011(stonith:external/ipmi):#011Stopped
> Feb 16 08:00:39 node3 pengine: [3876]: info: get_failcount:
> restofencenode1 has failed 1000000 times on node3
> Feb 16 08:00:39 node3 pengine: [3876]: WARN: common_apply_stickiness:
> Forcing restofencenode1 away from node3 after 1000000 failures (max=1000000)
> Feb 16 08:00:39 node3 pengine: [3876]: WARN: native_color: Resource
> restofencenode1 cannot run anywhere
> Feb 16 08:00:39 node3 pengine: [3876]: notice: LogActions: Leave
> resource restofencenode1#011(Stopped)

No messages from stonithd, strange.
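
One way to narrow that down would be to run the plugin by hand with the
stonith(8) CLI from cluster-glue, outside the cluster, using the same
parameters as the CIB above. The exact option set varies between
cluster-glue versions, so treat the flags below as a sketch (stonith -h
lists the real ones); the guard makes it harmless on a machine without
cluster-glue:

```shell
# -S asks the plugin for its status, which exercises the IPMI
# credentials; any error printed here is the same one lrmd would hit.
test_plugin() {
  if command -v stonith >/dev/null 2>&1; then
    stonith -t external/ipmi hostname=node3 ipaddr=12.1.1.121 \
            userid=mylogin password=mypass interface=lan -S
  else
    echo "stonith CLI not available on this machine"
  fi
}
test_plugin
```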

> knowing that I had set the following constraints :
>     <constraints>
>       <rsc_location id="loc1-restofencenode1" node="node3"
> rsc="restofencenode1" score="+INFINITY"/>
>       <rsc_location id="neverloc-restofencenode1" node="node1"
> rsc="restofencenode1" score="-INFINITY"/>
>       <rsc_location id="loc1-restofencenode3" node="node1"
> rsc="restofencenode3" score="+INFINITY"/>
>       <rsc_location id="neverloc-restofencenode3" node="node3"
> rsc="restofencenode3" score="-INFINITY"/>
>     </constraints>
>
>
> Either I made a big mistake, or there is something wrong in these
> releases, because I did much the same thing with the fc11 releases
> and it worked fine ... that setup ran on openais (openais.conf)
> rather than on corosync (corosync.conf), but this looks like a
> problem with resource management, not with cluster management ...
>
> Thanks a lot for any help, because I'm stuck ...

Can't say what's going on. Please make an hb_report covering the time
the resources failed (was that the first cluster start after the
upgrade?). If the tarball is too big to post, just open a bugzilla.

Thanks,

Dejan
