Re: [Xen-users] Re: Continuing BUG:-iness booting Fedora Rawhide 3.1.0-rc's (was Summary: Experiences setting up a debug serial port)

Pasi Kärkkäinen
On Mon, Sep 12, 2011 at 04:07:21PM -0400, jim burns wrote:
> On Mon September 12 2011, 10:36:23 AM, Pasi Kärkkäinen wrote:
> > > Sure, thanx, attached. Do you need a debug log also (initcall_debug
> > > debug  loglevel=10)?
> > >
> >
> > Sure, it doesn't hurt..
>
> I'll get it to you in a couple of days.
>

Ok, thanks!

Also, I assume you don't get any BUGs when booting the same kernel
on bare metal (native)?


> > > The last four BUG:s (from Sep  8 17:12:20 on) were from starting a winxp
> > > domu  (first 3 BUG:s), and destroying it at grub (the last BUG:).
> >
> > Ok, so there are BUGs when booting up, and also when starting HVM guests.
>
> Right, from Sep 8, 17:12 on. The 'comm:'s referenced are xenstored (2x) and
> qemu-dm (1x) when starting the guest (as far as grub), and xenconsoled when
> 'xm destroy'-ing it.
>
> Thanx for your interest.
>

-- Pasi

--
xen mailing list
[hidden email]
https://admin.fedoraproject.org/mailman/listinfo/xen

Re: [Xen-devel] Re: [Xen-users] Continuing BUG:-iness booting Fedora Rawhide 3.1.0-rc's (was Summary: Experiences setting up a debug serial port)

Konrad Rzeszutek Wilk
> Rawhide just came out with 3.1.0-0.rc6.git0.0.fc17.x86_64 , and the BUG:s are
> still the same as in the last log I sent. However, as promised I have attached
> the initcall_debug log, but for rc6.

Hey Jim,

We are quite sure we know the cause of this. I was wondering if you would be
up for beta-testing a patch for this?

Re: [Xen-devel] Re: [Xen-users] Continuing BUG:-iness booting Fedora Rawhide 3.1.0-rc's (was Summary: Experiences setting up a debug serial port)

Konrad Rzeszutek Wilk
On Wed, Sep 14, 2011 at 05:08:06AM -0400, Konrad Rzeszutek Wilk wrote:
> > Rawhide just came out with 3.1.0-0.rc6.git0.0.fc17.x86_64 , and the BUG:s are
> > still the same as in the last log I sent. However, as promised I have attached
> > the initcall_debug log, but for rc6.
>
> Hey Jim,
>
> We are quite sure we know the cause of this. I was wondering if you would be
> up for beta-testing a patch for this?

Specifically this patch seems to fix it for me:


commit 690dc11498b192db25762de77988224753517c96
Author: Konrad Rzeszutek Wilk <[hidden email]>
Date:   Wed Sep 14 05:10:00 2011 -0400

    xen/irq: Alter the locking to be a mutex.
   
    When we allocate/change the IRQ information, we do not
    need to use a spinlock. We can use a mutex (which is
    what the generic IRQ code does for allocations/changes).
   
    Suggested-by: Ian Campbell <[hidden email]>
    Signed-off-by: Konrad Rzeszutek Wilk <[hidden email]>

diff --git a/drivers/xen/events.c b/drivers/xen/events.c
index da70f5c..7523719 100644
--- a/drivers/xen/events.c
+++ b/drivers/xen/events.c
@@ -54,7 +54,7 @@
  * This lock protects updates to the following mapping and reference-count
  * arrays. The lock does not need to be acquired to read the mapping tables.
  */
-static DEFINE_SPINLOCK(irq_mapping_update_lock);
+static DEFINE_MUTEX(irq_mapping_update_lock);
 
 static LIST_HEAD(xen_irq_list_head);
 
@@ -631,7 +631,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi,
  int irq = -1;
  struct physdev_irq irq_op;
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  irq = find_irq_by_gsi(gsi);
  if (irq != -1) {
@@ -684,7 +684,7 @@ int xen_bind_pirq_gsi_to_irq(unsigned gsi,
  handle_edge_irq, name);
 
 out:
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
 
  return irq;
 }
@@ -710,7 +710,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
 {
  int irq, ret;
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  irq = xen_allocate_irq_dynamic();
  if (irq == -1)
@@ -724,10 +724,10 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
  if (ret < 0)
  goto error_irq;
 out:
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
  return irq;
 error_irq:
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
  xen_free_irq(irq);
  return -1;
 }
@@ -740,7 +740,7 @@ int xen_destroy_irq(int irq)
  struct irq_info *info = info_for_irq(irq);
  int rc = -ENOENT;
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  desc = irq_to_desc(irq);
  if (!desc)
@@ -766,7 +766,7 @@ int xen_destroy_irq(int irq)
  xen_free_irq(irq);
 
 out:
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
  return rc;
 }
 
@@ -776,7 +776,7 @@ int xen_irq_from_pirq(unsigned pirq)
 
  struct irq_info *info;
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  list_for_each_entry(info, &xen_irq_list_head, list) {
  if (info == NULL || info->type != IRQT_PIRQ)
@@ -787,7 +787,7 @@ int xen_irq_from_pirq(unsigned pirq)
  }
  irq = -1;
 out:
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
 
  return irq;
 }
@@ -802,7 +802,7 @@ int bind_evtchn_to_irq(unsigned int evtchn)
 {
  int irq;
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  irq = evtchn_to_irq[evtchn];
 
@@ -818,7 +818,7 @@ int bind_evtchn_to_irq(unsigned int evtchn)
  }
 
 out:
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
 
  return irq;
 }
@@ -829,7 +829,7 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
  struct evtchn_bind_ipi bind_ipi;
  int evtchn, irq;
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  irq = per_cpu(ipi_to_irq, cpu)[ipi];
 
@@ -853,7 +853,7 @@ static int bind_ipi_to_irq(unsigned int ipi, unsigned int cpu)
  }
 
  out:
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
  return irq;
 }
 
@@ -878,7 +878,7 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
  struct evtchn_bind_virq bind_virq;
  int evtchn, irq;
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  irq = per_cpu(virq_to_irq, cpu)[virq];
 
@@ -903,7 +903,7 @@ int bind_virq_to_irq(unsigned int virq, unsigned int cpu)
  }
 
 out:
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
 
  return irq;
 }
@@ -913,7 +913,7 @@ static void unbind_from_irq(unsigned int irq)
  struct evtchn_close close;
  int evtchn = evtchn_from_irq(irq);
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  if (VALID_EVTCHN(evtchn)) {
  close.port = evtchn;
@@ -943,7 +943,7 @@ static void unbind_from_irq(unsigned int irq)
 
  xen_free_irq(irq);
 
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
 }
 
 int bind_evtchn_to_irqhandler(unsigned int evtchn,
@@ -1279,7 +1279,7 @@ void rebind_evtchn_irq(int evtchn, int irq)
    will also be masked. */
  disable_irq(irq);
 
- spin_lock(&irq_mapping_update_lock);
+ mutex_lock(&irq_mapping_update_lock);
 
  /* After resume the irq<->evtchn mappings are all cleared out */
  BUG_ON(evtchn_to_irq[evtchn] != -1);
@@ -1289,7 +1289,7 @@ void rebind_evtchn_irq(int evtchn, int irq)
 
  xen_irq_info_evtchn_init(irq, evtchn);
 
- spin_unlock(&irq_mapping_update_lock);
+ mutex_unlock(&irq_mapping_update_lock);
 
  /* new event channels are always bound to cpu 0 */
  irq_set_affinity(irq, cpumask_of(0));


Attachment: mutex_instead_of_spinlock.patch (5K)

Re: [Xen-devel] Re: [Xen-users] Continuing BUG:-iness booting Fedora Rawhide 3.1.0-rc's (was Summary: Experiences setting up a debug serial port)

Konrad Rzeszutek Wilk
On Wed, Sep 14, 2011 at 05:07:28PM -0400, jim burns wrote:

> On Wed September 14 2011, 6:57:11 AM, Konrad Rzeszutek Wilk wrote:
> > On Wed, Sep 14, 2011 at 05:08:06AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > Rawhide just came out with 3.1.0-0.rc6.git0.0.fc17.x86_64 , and the
> > > > BUG:s are  still the same as in the last log I sent. However, as
> > > > promised I have attached the initcall_debug log, but for rc6.
> > >
> > >
> > >
> > > Hey Jim,
> > >
> > >
> > >
> > > We are quite sure we know the cause of this. I was wondering if you
> > > would be up for beta-testing a patch for this?
> >
> > Specifically this patch seems to fix it for me:
> >
> >
> > commit 690dc11498b192db25762de77988224753517c96
> > Author: Konrad Rzeszutek Wilk <[hidden email]>
> > Date:   Wed Sep 14 05:10:00 2011 -0400
> >     xen/irq: Alter the locking to be a mutex.
>
> I'll try to apply this to fedora's xen src rpm over the weekend. In case it
> doesn't apply, would you remind me of the git commands for the code you
> applied this patch to? Thanx.

I just save the email in an mbox file and then run
git am < <the saved mbox file>, which should apply it automatically.

Or you can just run patch -p1 < <the saved mbox file>, which will apply it too.
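
A minimal sketch of the patch -p1 route, assuming GNU patch and a throwaway scratch tree. The directory names and the one-line file contents below are made up for illustration and are not the real kernel source. It checks the patch with --dry-run first, then applies it:

```shell
#!/bin/sh
# Sketch: dry-run a -p1 patch against a scratch tree before applying it.
# Every path below is a hypothetical scratch location, not a kernel checkout.
set -eu

work=$(mktemp -d)
mkdir -p "$work/a/drivers/xen" "$work/b/drivers/xen"

# "a" is the unpatched tree, "b" the patched one (one line each, for brevity).
printf 'spin_lock(&irq_mapping_update_lock);\n'  > "$work/a/drivers/xen/events.c"
printf 'mutex_lock(&irq_mapping_update_lock);\n' > "$work/b/drivers/xen/events.c"

# diff exits 1 when the trees differ, so keep "set -e" from aborting here.
(cd "$work" && diff -ru a b > fix.patch) || true

# Work on a copy of the unpatched tree, as one would in a source checkout.
cp -R "$work/a" "$work/tree"
cd "$work/tree"

# -p1 strips the leading "a/" or "b/" path component from the patch headers.
patch -p1 --dry-run < "$work/fix.patch"   # check only, changes nothing
patch -p1 < "$work/fix.patch"             # the real apply

result=$(cat drivers/xen/events.c)
echo "$result"
```

In a git checkout, git am < <the saved mbox file> does the same job and also records the commit message and author from the email.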


Re: [Xen-devel] Re: [Xen-users] Continuing BUG:-iness booting Fedora Rawhide 3.1.0-rc's (was Summary: Experiences setting up a debug serial port)

M A Young
On Wed, 14 Sep 2011, Konrad Rzeszutek Wilk wrote:

> On Wed, Sep 14, 2011 at 05:07:28PM -0400, jim burns wrote:
>> On Wed September 14 2011, 6:57:11 AM, Konrad Rzeszutek Wilk wrote:
>>> On Wed, Sep 14, 2011 at 05:08:06AM -0400, Konrad Rzeszutek Wilk wrote:
>>>
>>> commit 690dc11498b192db25762de77988224753517c96
>>> Author: Konrad Rzeszutek Wilk <[hidden email]>
>>> Date:   Wed Sep 14 05:10:00 2011 -0400
>>>     xen/irq: Alter the locking to be a mutex.
>>
>> I'll try to apply this to fedora's xen src rpm over the weekend. In case it
>> doesn't apply, would you remind me of the git commands for the code you
>> applied this patch to? Thanx.
>
> I just save the email in some mbox file and then do
> git am < <the saved mbox file> and it should automatically add it.
>
> Or you can just do patch -p1 <the saved mbox file> and it will apply it too.

I have a (temporary) F17 kernel with the patch from
http://oss.oracle.com/git/kwilk/xen.git/?p=kwilk/xen.git;a=commit;h=a7079a6404ed2106315327fff6be3464d10814e7 
(which I assume is the same patch) applied building at
http://koji.fedoraproject.org/koji/taskinfo?taskID=3352367
if you want to test it.

  Michael Young

Re: [Xen-devel] Re: [Xen-users] Continuing BUG:-iness booting Fedora Rawhide 3.1.0-rc's (was Summary: Experiences setting up a debug serial port)

Konrad Rzeszutek Wilk
On Thu, Sep 15, 2011 at 12:38:26AM +0100, M A Young wrote:

> On Wed, 14 Sep 2011, Konrad Rzeszutek Wilk wrote:
>
> >On Wed, Sep 14, 2011 at 05:07:28PM -0400, jim burns wrote:
> >>On Wed September 14 2011, 6:57:11 AM, Konrad Rzeszutek Wilk wrote:
> >>>On Wed, Sep 14, 2011 at 05:08:06AM -0400, Konrad Rzeszutek Wilk wrote:
> >>>
> >>>commit 690dc11498b192db25762de77988224753517c96
> >>>Author: Konrad Rzeszutek Wilk <[hidden email]>
> >>>Date:   Wed Sep 14 05:10:00 2011 -0400
> >>>    xen/irq: Alter the locking to be a mutex.
> >>
> >>I'll try to apply this to fedora's xen src rpm over the weekend. In case it
> >>doesn't apply, would you remind me of the git commands for the code you
> >>applied this patch to? Thanx.
> >
> >I just save the email in some mbox file and then do
> >git am < <the saved mbox file> and it should automatically add it.
> >
> >Or you can just do patch -p1 <the saved mbox file> and it will apply it too.
>
> I have a (temporary) F17 kernel with the patch from http://oss.oracle.com/git/kwilk/xen.git/?p=kwilk/xen.git;a=commit;h=a7079a6404ed2106315327fff6be3464d10814e7
> (which I assume is the same patch) applied building at

Yup!

> http://koji.fedoraproject.org/koji/taskinfo?taskID=3352367
> if you want to test it.

Great! Thanks for doing this.
>
> Michael Young

Re: [Xen-devel] Re: [Xen-users] Continuing BUG:-iness booting Fedora Rawhide 3.1.0-rc's (was Summary: Experiences setting up a debug serial port)

Konrad Rzeszutek Wilk
In reply to this post by M A Young
On Wed, Sep 14, 2011 at 09:18:25PM -0400, jim burns wrote:

> On Thu September 15 2011, 12:38:26 AM, M A Young wrote:
> > I have a (temporary) F17 kernel with the patch from
> > http://oss.oracle.com/git/kwilk/xen.git/?p=kwilk/xen.git;a=commit;h=a7079a6404ed2106315327fff6be3464d10814e7
> > (which I assume is the same patch) applied building at
> > http://koji.fedoraproject.org/koji/taskinfo?taskID=3352367
> > if you want to test it.
>
> Oh - the patch is to the kernel, not to xen. I misread it.
>
> Ok - I downloaded
> http://koji.fedoraproject.org/koji/getfile?taskID=3352368&name=kernel-3.1.0-0.rc6.git0.0.xendom0.fc17.x86_64.rpm
>
> I hope that's right. I'll try it tomorrow. Thanx.

ping?

Re: [Xen-devel] Re: [Xen-users] Continuing BUG:-iness booting Fedora Rawhide 3.1.0-rc's (was Summary: Experiences setting up a debug serial port)

Konrad Rzeszutek Wilk
In reply to this post by Konrad Rzeszutek Wilk
> Testing out myoung's 3.1.0 was not as straight forward as I had hoped. It did
> boot up without any BUG:s, but I did get the occasional Lock Order message.
> Log snippet at the end of the post. It doesn't seem to be directly related to
> starting guests.

Yeah, those look like the network card (b44 driver) is at fault.
>
> The real problem comes in starting up guests. Performance is very bad. I knew
> from working with rawhide 3.0.0 (long since replaced) that performance would
> suffer - rawhide kernels are debug kernels:

Right. They are horrendously slow.

>
> jimb@insp6400 09/16/11 10:16AM:~
> [511] > grep DEBUG /boot/config-3.1.0-0.rc6.git0.0.xendom0.fc17.x86_64|grep -v 'is not set'|wc -l
> 91
> jimb@insp6400 09/16/11 10:16AM:~
> [512] > grep DEBUG /boot/config-2.6.40.4-5.fc15.x86_64|grep -v 'is not set'|wc -l
> 54
> jimb@insp6400 09/16/11 10:18AM:~
> [513] > grep DEBUG /boot/config-3.0.1-3.fc16.x86_64|grep -v 'is not set'|wc -l
> 90
>
> Starting guests is much slower under myoung's 3.1.0 than under rawhide's 3.1.0
> or 3.0.{0,1}. A cifs backed pv domu took 6 min. for 'xm create' to exit,

That's a debug kernel, which will indeed be quite slow.

> root@insp6400 09/16/11 12:09AM:~
> [544] > xl create Documents/winxp; brctl show; ps -A|grep qemu; netstat -tlp|grep 59; renice -11 `pidof qemu-dm`; ps -A|grep vncv; ifconfig vif1.0 mtu 9000
> Parsing config file Documents/winxp
> xc: info: VIRTUAL MEMORY ARRANGEMENT:
>   Loader:        0000000000100000->000000000017b270
>   TOTAL:         0000000000000000->000000003fc00000
>   ENTRY ADDRESS: 00000000001015a0
> xc: error: Could not allocate memory for HVM guest. (16 = Device or resource busy): Internal error
> libxl: error: libxl_dom.c:284:libxl__build_hvm hvm building failed

How much memory have you allocated to your dom0 and domU?

>
> and my serial debug log had several:
>
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (1 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=9 extent: id=1 memflags=0 (0 of 4)
> (XEN) memory.c:133:d0 Could not allocate order=0 extent: id=1 memflags=0 (439 of 2048)
>
> Then I remembered that I recently upped the memory allocation for my winxp
> domu, from 512 to 768. This works fine under 2.6.40, the f15 non-debug
> production kernel. None the less, I knocked the allocation back down to 512,
> and my winxp domu did start up, getting to the qemu splash screen in about 2 -
> 3 min., during part of which dom0 was unresponsive. However, I'm still getting
> the '(XEN) memory.c' errors, and some frequent GPF errors (a few a min.) in my
> serial debug log:
>
> (XEN) traps.c:2956: GPF (0060): ffff82c48015354a -> ffff82c480200131
>
> Then, rawhide and gplpv don't get along. Specifically, the xennet receive side
> driver stops working, and I have to fall back to qemu emulation. It takes
> about an hour for the winxp desktop to finish initializing, with dom0 cpu load
> on one cpu core at 72% - yum! But I'll just have to live with it - it's not
> your problem. I'll leave it up for at least a day to see if any other messages
> pop up.

Keep in mind that the patch for the issue in the subject line is going into 3.1,
so it will show up in FC16 at some point.

You can also rebuild your kernel without the debug options.
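
For anyone wanting to see which debug options a config enables before rebuilding, here is a rough sketch along the lines of the grep DEBUG check quoted above. The sample config contents are invented for illustration; on a real system you would point it at a file such as /boot/config-$(uname -r), assuming your distribution installs one there.

```shell
#!/bin/sh
# Sketch: count enabled DEBUG options in a kernel config file.
# The sample config below is hypothetical, written to a temp file.
set -eu

config=$(mktemp)
cat > "$config" <<'EOF'
CONFIG_DEBUG_KERNEL=y
# CONFIG_DEBUG_SPINLOCK is not set
CONFIG_DEBUG_MUTEXES=y
CONFIG_PROVE_LOCKING=y
CONFIG_XEN_DEBUG_FS=y
EOF

# Match only options actually set to y or m. Anchoring on "CONFIG_...="
# skips the "# ... is not set" comment lines without a second grep.
enabled=$(grep -c '^CONFIG_[A-Z0-9_]*DEBUG[A-Z0-9_]*=[ym]' "$config")
echo "enabled DEBUG options: $enabled"
```

The count is only a rough indicator: some expensive checks, CONFIG_PROVE_LOCKING for example, do not have DEBUG in their name, so a name-based grep will miss them.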